AD-A125  8S2 
UNCLASSIFIED 


DECOMPOSITION  OF  TIME  SCALES  IN  LINEAR  SVSTEMS  AND  '  1/2 

MARKOVIAN  DECISION  PROCESSES*!!)  ILLINOIS  UNIV  AT  URBANA 
DECISION  AND  CONTROL  LAB  R  G  PHILLIPS  NOV  SB  DC-44 
N08814-79-C-8424  F/G  12/1  NL 


K 


B 

■  '  MICROCOPY  RESOLUTION  TEST  CHART 

NATIONAL  BUREAU  OF  STANDARDS- 1963-A 


DIB  FILE  COPY  M  A  12  5  8  52 


UNIVERSITY  OF  ILLINOIS  AT  URBANA-CHAMPAIGN 


83  08  21  009  ! 

d 


UNCLASSIFIED 


'SECURITY  CLASSIFICATION  OF  TNII  MM  (TWlH  Dmm  Knitted) 


REPORT  DOCUMENTATION  PAGE 


4.  TITLE  (end  Submit) 

DECOMPOSITION  OF  TIME- SCALES  IN  LINEAR  SYSTEMS 
AND  MARKOVIAN  DECISION  PROCESSES 


7.  AUTHORS 


Randolph  Gale  Phillips 


rxad  mmtucTiOM 

BEFORE C 


1.  RECIPIENT'S  CATALOS  NUMSSI 


S.  TYPE  OF  REPORT  •  PSWOO  COVERED 


Technical  Repose 

«.  PERFORMING  ORG.  REPORT  MUMPER 

R-902 (DC-44) ;UILU-ENG-80-2233 


CONTRACT  OR  ORANT  HUMBERT •) 

N00014- 7?-C-0424 
NSF  EC^9-19396 


S.  PERFORMING  OROANIZATION  NAME  ANO  AOOREU 

Coordinated  Science  Laboratory 
University  of  Illinois  at  Urbana-Champaign 
Urbana,  Illinois  61801 


IS.  REPORT  DATE 

November,  1980 


■  3.  NUMBER  OF  PACES 

132 


MONITORING  AGENCY  NAME  A  AOORCSSfJf  dl  Iterant  from  Control  lint  Oltlet)  IS.  SECURITY  CLASS,  (a I  Ml*  report) 

UNCLASSIFIED 

IS*.  DECLASSIFICATION/ DOWNGRADING 
SCHEDULE 


1 1.  CONTROLLING  OFFICE  NAME  ANO  AOORESS 


Joint  Services  Electronics  Program 


l«.  DISTRIBUTION  STATEMENT  (at  tbla  Report) 


Approved  for  public  release;  distribution  unlimited. 


17.  DISTRIBUTION  STATEMENT  (ol  lb  a  abatraet  entered  In  Block  20,  II  dlllttanl  Item  Report)  . 


19.  KEY  WORDS  r Continue  on  reverse  r'de  if  neceaeaty  and  identify  by  block  number) 

Decomposition  Decentralized  optimization 

Time  scales 

Discrete  time  systems 
Singular  perturbation 
Markov  chains 


20.  ABSTRACT  (Continue  on  rovmrao  aid •  //  nmcaaaary  and  idanti/y  by  block  numbar) 

The  presence  of  ’’slow"  and  "fast"  dynamics  in  large  scale  systems  has  motivated 
the  use  of  singular  perturbations  as  a  means  of  obtaining  reduced  order  models 
for  analysis  and  control  law  design.  In  this  thesis  we  establish  how  systems 
having  this  "two-time-scale"  property  can  use  singular  perturbation  modeling  to 
make  this  property  explicit  enabling  various  reduced  order  analysis  and  design 
techniques  to  be  applied.  For  deterministic  linear  time-invariant  systems, 
various  techniques  for  obtaining  reduced  order  models  are  unified  through  left 
and  right  eigenspace  decompositions.  A  general  two  stage  control  design 


FORM 
I  JAN  73 


EOlTION  0»  I  NOV  SS  13  obsolete 


UNCLASSIFIED 

SECURITY  CLASSIFICATION  of  ”HlS  pace  (Whan  Date  Entered) 


UNCLASSIFIED _ 

SSCUMTV  CLASSIFICATION  OF  THIS  PAaKfWftaa  Of  bMN] 


20.  ABSTRACT  (conClnued) 


procedure  for  separate  fast  and  slow  subsystems  is  developed  which  can  be 
applied  to  both  continuous  and  discrete  time  models.  Finally,  Markov  chain 
models  of  stochastic  systems  with  "weak”  and  "strong"  trnsition  probabilities 
lead  to  a  singularly  perturbed  model  from  which  we  obtain  the  concept  of  the 
reduced  order  "aggregate"  chain.  For  controlled  Markov  chains  the  aggregate 
model  is  used  to  develop  decentralized  optimization  algorithms  for  the 
discounted  and  average  cost  per  stage  problem. 


SCCUNITV  CLASSIFICATION  OF  *HIS  PAStrUftwi  Of  tnfrl) 


UILU-ENG-80-2233 


DECOMPOSITION  OF  TIME  SCALES  IN  LINEAR  SYSTEMS 
AND  MARKOVIAN  DECISION  PROCESSES 

by 

Randolph  Gale  Phillips 


This  research  was  supported  in  part  by  the  Joint  Services  Electronics 
Program  under  Contract  N00014-79-C-0424 ,  and  in  part  by  the  National  Science 
Foundation  under  Grant  ECS-79- 19396. 


Reproduction  in  whole  or  in  part  is  permitted  for  any  purpose  of 
the  United  States  Government. 


Approved  for  public  release.  Distribution  unlimited. 


DECOMPOSITION  OF  TIME  SCALES  IN  LINEAR  SYSTEMS 
AND  MARKOVIAN  DECISION  PROCESSES 


BY 

RANDOLPH  GALE  PHILLIPS 

B.E.E.,  Villanova  University,  1976 
M.S.,  University  of  Illinois,  1979 


THESIS 


Submitted  in  partial  fulfillment  of  the  requirements 
for  the  degree  of  Doctor  of  Philosophy  in  Electrical  Engineering 
in  the  Graduate  College  of  the 
University  of  Illinois  at  Urbana-Champaign,  1981 


Thesis  Adviser:  Professor  P.  V.  Kokotovic 


Urbana,  Illinois 


DECOMPOSITION  OF  TIME  SCALES  IN  LINEAR  SYSTEMS 
AND  MARKOVIAN  DECISION  PROCESSES 


Randolph  Gale  Phillips,  Ph.D. 
Department  of  Electrical  Engineering 
University  of  Illinois  at  Urbana-Champaign,  1981 


Y  4 

-flalmi11  anA  n 


...>  The  presence  of  -‘'slow'"  and  "fast*  dynamics  in  large  scale  systems 
has  motivated  the  use  of  singular  perturbations  as  a  means  of  obtaining  ^ 
reduced  order  models  for  analysis  and  control  law  design.  In  this  thesis  we~ 
establish  how  systems  having  this  '"tvo-time-scaleM~ property  can  use  singular 
perturbation  modeling  to  make  this  property  explicit  enabling  various 
reduced  order  analysis  and  design  techniques  to  be  applied.  For  deterministic 
linear  time- invariant  systems,  various  techniques  for  obtaining  reduced  order 
models  are  unified  through  left  and  right  eigenspace  decompositions .  A 
general  two  stage  control  d*  ~gn  procedure  for  separate  fast  and  slow  sub¬ 
systems  is  developed  which  can  be  applied  to  both  continuous  and  discrete 
time  models.  Finally,  Markov  chain  models  of  stochastic  systems  with  "weak*' 

and  "strong”  transition  probabilities  lead  to  a  singularly  perturbed  model 
,7  •  -  i  p  . 

from  which  we  obtain  the  concept  of  the  reduced  order^aggregate"  chain. 


For  controlled  Markov  chains  the  aggregate  model  is  used  to  develop  decen¬ 
tralized  optimization  algorithms  for  the  discounted  and  average  cost  per 
stage  problems. 


The  author  wishes  to  thank  his  advisor.  Professor  P.  V.  Ko koto vie 
for  his  guidance,  encouragement,  and  patience  during  the  course  of  this 
research.  He  would  like  to  thank  Professors  J.  B.  Cruz,  Jr.,  W.  R.  Perkins 
D.  P.  Looze,  and  B.  E.  Hajek  for  serving  on  his  dissertation  committee.  He 
would  also  like  to  thank  Mrs.  Rose  Harris  and  Mrs.  Wilma  McNeal  for  their 
excellent  typing.  Finally,  the  author  is  forever  grateful  to  his  wife,  Ann, 
and  children  Randy,  Jr.,  Derrich,  and  Kristen  for  giving  him  enough  time  to 


finish  this  research. 


TABLE  OF  CONTENTS 


Chapter  Pa; 

1.  INTRODUCTION .  1 

2.  EIGENSTRUCTURE  DECOMPOSITION  OF  TIME  SCALES  IN  LINEAR 

TIME-INVARIANT  SYSTEMS  .  5 

2.1.  Introduction . . . . .  5 

2.2.  Left  and  Right  Eigenspace  Decompositions  .  7 

2.3.  Block  Dlagonalization  and  Identification  of 

Fast  and  Slow  State  Vector  Components  . . .  16 

2.4.  Ordering  of  State  Variables  .  20 

2.5.  Example  -  8th  Order  Power  System  Model  . .  22 

3.  ASYMPTOTIC  SERIES  DECOMPOSITION  OF  TIME  SCALES  IN  LINEAR 

TIME- INVARIANT  SYSTEMS  .... .  40 

3.1.  Introduction  .  40 

3.2.  Series  Solutions  to  Riccati  Iterations  .  42 

3.3.  Fundamental  Sets  of  Solutions  .  47 

3.4.  Asymptotic  Expansions  Using  the  Method  of  Vasileva  .  52 

3.5.  Simplified  Iterative  Methods  . ! .  65 

3.6.  Full  and  Partial  Pole  Placement  .  66 

4.  SINGULAR  PERTURBATION  MODELING  OF  MARKOV  CHAINS  . .  71 

4.1.  Introduction .  71 

4.2.  Singular  Perturbation  Modeling  .  72 

4.3.  Ordering  of  States  .  78 

4.4.  Two-Time-Scale  Asymptotic  Expansion  .  83 

5.  THE  DISCOUNTED  COST  PROBLEM  FOR  CONTROLLED  MARKOV  CHAINS  .  89 

5.1.  Introduction  .  89 

5.2.  Decomposition  of  the  Cost  for  a  Fixed  Policy  .  90 

5.3.  Near  Optimal  Control . . .  93 

5.4.  Decentralized  Optimization  .  97 

6.  THE  AVERAGE  COST  PER  STAGE  PROBLEM  FOR  CONTROLLED 

MARKOV  CHAINS  .  10: 

6.1.  Introduction  .  iq 

6.2.  Decomposition  of  the  Cost  for  a  Fixed  Policy  .  1ft 

6.3.  Near  Optimal  Control  .  10 

6.4.  Decentralized  Optimization  .  iq< 

6.5.  Example  -  Minimizing  the  Average  Cost  per  Stage  . .  11' 

7.  CONCLUSIONS  .  12; 

REFERENCES  .  19 


1 


CHAPTER  1 
INTRODUCTION 

The  analysis  and  control  of  large  scale  systems  has  always  been  a 
challenging  problem  to  control  engineers.  The  large  dimension  of  the  system 
model  and  the  rich  interactions  make  conventional  simulation  and  design* 
algorithms  impractical  if  not  impossible  to  use.  The  research  direction  in 
this  area  has  centered  on  obtaining  reduced  order  models  from  large  scale  models 
without  sacrificing  significant  accuracy.  These  reduced  order  models  are 
usually  constructed  in  one  of  two  ways.  First,  within  a  large  scale  model, 
several  smaller  subsystems  are  identified  by  their  weak  couplings  to  the 
remainder  of  the  system.  These  reduced  order  subsystems  are  treated 
separately  for  simulation  and  design  purposes  with  interactions  between  sub¬ 
systems  taken  care  of  separately  [1,2].  Second,  the  entire  large  scale  model 
is  approximated  by  one  reduced  order  model  where  the  dynamics  of  the  reduced 
order  model  are  determined  by  the  "dominant"  dynamics  of  the  large  scale 
model  [3,4],  Over  the  years,  many  names  have  been  given  to  various  methods 
of  order  reduction.  Of  these  methods  aggregation  and  singular  perturbations 
seem  to  be  the  most  well  known  [5].  The  analysis  and  design  of  singularly 
perturbed  systems  has  been  well  documented  [6,7,8].  The  multiple-time-scale 
property  of  these  systems  has  been  used  in  developing  reduced  order  models 
and  control  laws  for  high  order  "stiff"  models.  This  thesis  further  contri¬ 
butes  to  the  theory  of  multiple-time-scale  systems  and  how  they  can  be  used 
as  a  powerful  order  reduction  technique  for  both  the  analysis  and  design  of 
large  scale  systems. 
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In  Chapter  2  we  consider  the  linear  time-invariant  system 


dx(t) 

dt 


Ax(t)  +  Bu(t) 


(1.1) 


where  x(t)€Rq,  u(t)SRP,  and  {a^},  (b^}€R  Vi,j.  Many  seemingly  unrelated 
iterative  techniques  [9-12]  have  been  proposed  for  transforming  (1.1)  into 
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where  y€Rn.  weRm,  and 


min  | X±(A*> (  <  max  iyD*>, 
i  1,  • . .  ,n  j**l»  •  • .  ,m 


(1.3) 


(in  (1.3),  X^(Z)  is  the  ith  eigenvalue  of  the  matrix  Z) .  When  such  a  trans¬ 
formation  is  possible,  (1.1)  is  said  to  have  a  "two- time-scale  property." 

These  spectrum  separation  techniques  can  then  be  used  for  reduced  order 
modeling  and  design.  Unfortunately,  the  convergence  results  of  these  iterative 
techniques  are  either  heuristic  or  conservative  [10,12].  In  this  chapter  we 
unify  and  extend  the  results  of  time  scale  decomposition  techniques  to  form 
a  composite  theory  from  which  all  previous  methods  can  be  derived  as  variations 
to  the  basic  results. 

In  Chapter  3  we  consider  the  time  scale  decomposition  of  singularly 
perturbed  systems.  For  this  problem  (1.1)  takes  the  form 
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where  x^SR  ,  «2SR  »  an<*  0<y<<l.  In  this  case,  the  singular  perturbation 
parameter  p  explicitly  defines  the  presence  of  a  two-time-scale  phenomena. 

Using  this  parameter,  two-time-scale  asymptotic  expansions  [8,13]  of  the 
state  vector  can  be  used  to  obtain  a  reduced  order  solution  to  (1.4).  n  this 

chapter  we  show  the  relationship  between  the  "spectrum  separating"  it  tive 

techniques  of  Chapter  2  and  matched  asymptotic  expansions  used  for  si  iarly 
perturbed  models.  The  results  of  these  two  chapters  establish  a  unifu.  .  theory 
of  time  scale  decompositions  in  linear  time-invariant  systems.  In  the  remainder 
of  this  thesis  we  consider  a  class  of  stochastic  systems  in  which  the  results 
from  both  singular  perturbations  and  aggregation  techniques  are  applied. 

Markov  chains  and  Markovian  decision  processes  have  long  been  used 
in  the  analysis  and  design  of  stochastic  systems  [14,15].  Some  of  the 
application  areas  of  Markov  modeling  included  the  following: 

i.  Numerical  solutions  to  stochastic  control  problems  [16] 

ii.  Inventory  theory  [17] 

iii.  Queuing  theory  [18]. 

Markovian  decision  processes  can  be  traced  back  to  Bellman's  development  of 
Dynamic  Programming  [19,20]  where  many  of  the  Markov  chain  control  problems 
were  formulated.  Since  this  time,  many  design  algorithms  for  a  variety  of 
controlled  Markov  chain  problems  have  been  developed.  The  theoretical  richness 
of  this  area  has  kept  it  popular  with  researchers.  However,  the  practical 
usefulness  of  Markov  models  and  Markovian  decision  processes  has  been 
severely  limited  due  to  the  extremely  large  dimension  of  most  Markov  chains. 

The  computational  burden  of  these  problems  has  discouraged  systems  engineers 
from  using  Markov  chains  for  modeling  purposes.  Recent  applications  in 
queueing  theory  [21]  and  the  management  of  hydrodams  [22]  have  exhibited 


Markov  chain  models  with  a  "weakly  coupled"  structure  suitable  for  perturba- 
tional  analysis.  In  recent  years,  authors  [23-25]  have  used  this  structure 
to  construct  reduced  order  aggregate  models  for  large  Markov  chains.  Since 
the  aggregate  models  were  developed  using  an  intuitive  understanding  of  the 
process  dynamics,  the  results  were  limited  and  a  more  complete  theory  needed 
in  order  for  this  concept  to  become  useful  for  analysis  and  design.  Therefore 
in  Chapter  4  we  show  how  a  weakly  coupled  Markov  chain  can  be  transformed  into 
a  singularly  perturbed  system  model.  Then,  the  decomposition  techniques  of 
Chapters  2  and  3  can  readily  be  applied. 

In  Chapters  5  and  6  we  consider  the  problem  of  optimally  controlling 
Markov  chains  with  respect  to  certain  performance  measurements.  In  general, 
these  problems  are  computationally  horrendous.  However,  by  applying  the 
results  of  Chapter  4,  a  near  optimal  policy  can  be  found  using  a  simplified 
decentralized  optimization  algorithm.  In  Chapter  5  we  consider  the  discounted 
cost  problem  [15]  and  in  Chapter  6  the  average  cost  per  stage  problem  [15]. 

Finally,  in  Chapter  7  we  draw  conclusions  and  point  to  a  number  of 
research  directions  to  which  this  thesis  leads. 
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CHAPTER  2 

EIGENSTRUCTURE  DECOMPOSITION  OF  TIME  SCALES 
IN  LINEAR-TIME  INVARIANT  SYSTEMS 


2.1.  Introduction 

Many  control  theory  concepts  are  valid  for  any  system  order, 
however,  their  actual  use  is  limited  to  low  order  models.  Large  scale 
systems  result  not  only  in  a  formidable  amount  of  computation,  but  also  in 
ill-conditioned  Initial  and  two  point  boundary  value  problems.  The  inter¬ 
action  of  fast  and  slow  phenomena  in  high-order  systems  results  in  stiff 
numerical  problems  which  require  expensive  integration  routines.  The  singular 
perturbations  approach  to  decomposing  fast  and  slow  phenomena  involves  using 
a  time-scale  separation  technique.  In  this  case  a  reduced  order  "steady 
state"  and  "boundary  layer"  solutions  are  obtained  from  a  high  order  model. 
Control  designs  and  simulations  for  the  high  order  model  are  then  carried  out 
on  the  reduced  order  subsystems. 

It  is  the  purpose  of  this  chapter  to  unify  and  extend  the  results 
of  previous  authors  [6-12]  and  attempt  to  provide  a  sense  of  completeness  to 
the  theory  of  time-scale  separation  in  linear  systems.  Given  the  linear  time 
Invariant  homogeneous  system 


(2.1) 
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our  purpose  is  to  transform  it  into  either 


y 

’a 

B  ‘ 

V 

. 

30 

n 

0 

D 

n 

30 

(2.2) 


6 


such  Chat 


or 


•  • 

■ 

m 

ft  m 

k 

A* 

00 

0 

5 

• 

z 

c 

D* 

z 

•  • 

*• 

00  m 

•  m 

min  |  oCDj  j  >  max  j  a  (A")  | 
mln|o(D*)|  >maxjo(A*)|. 


(2.3) 


(2.4) 

(2.5) 


Any  system  (2.1)  which  has  this  property  Is  said  to  satisfy  the  two-time- 
scale  property  for  dimensions  n  and  m. 

In  Section  2.2  earlier  methods  of  time-scale  decomposition  used  to 
transform  (2.1)  into  (2.2)  and  (2.3)  are  briefly  reviewed.  Then  using  a 
modified  form  of  dominant  left  and  right  eigenspace  power  Iterations  the 
equivalence  of  these  past  iterative  schemes  is  established.  This  enables  us 
to  define  unified  conditions  for  convergence  of  algorithms  as  well  as  the 
convergence  rates . 

Section  2.3  completes  the  block  diagonalizatlons  of  (2.2)  and  (2.3) 
and  identifies  "fast"  and  "slow"  components  of  our  original  state  vectors. 

The  explicit  invertibility  of  the  transformation  matrices  is  shown.  This 
becomes  very  important  in  later  chapters. 

In  Section  2.4  we  consider  the  problem  or  properly  ordering  the 
state  variables.  A  recently  developed  "grouping  algorithm"  [34]  used  for 
power  networks  is  shown  to  be  directly  applicable. 

Finally,  in  Section  2.5,  an  example  is  given  of  the  eigenspace 
decompositions . 


7 


2.2.  Left  and  Right  Elgenapace  Decomposition 

In  Che  first  pare  of  this  section  we  briefly  review  two  well  known 
iterative  methods  for  transforming  (2.1)  into  (2.2)  or  (2.3) . _ We  then  intro¬ 

duce  the  dominant  left  and  right  eigenspace  power  iterations.  Using  modified 
versions  of  these  iterations,  it  is  possible  to  show  the  equivalence  of  the 
earlier  methods.  Less  conservative  conditions  for  the  convergence  of 
algorithms  as  well  as  convergence  rates  are  also  obtained. 

i)  Quasi  steady  state  method  [6] 

This  method  was  motivated  by  singularly  perturbed  models.  Assume 
that  the  states  are  ordered  such  that  D  1  exists.  This  assumption  is 
standard  in  studies  of  singularly  perturbed  systems.  However,  as  will  be 
presented  later,  this  assumption  can  be  justified. 

If  the  eigenvalues  of  D  are  such  that  the  real  parts  are  large  and 
negative,  then  the  homogeneous  solution  of  z  converges  to  a  steady-state 
rapidly.  If  this  convergence  is  assumed  to  be  instantaneous,  then  z»0  and 
this  quasi  steady  state  assumption  yields 

z  ■  -D  ^Cy  .  (2.6) 

s  s 

Next  we  try  to  remove  this  slow  part  of  z  by  introducing 

■  z  +  D  ^Cx  (2.7) 


which  transforms  (2.1)  into 
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Repeating  steps  (2.6)  and  (2.7)  k  times  results  in  the  following 
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where  the  subsystem  matrices  are  defined  as 

\  -  \-l  -  B\-l  Vl  Ao'A 

c*  -  Dk-Ick-A  co'c 


(2.10) 


(2.11) 


(2.12) 

(2.13) 


Experimental  results,  motivated  by  singular  perturbations,  have 
converged  to  the  form  (2.2)  satisfying  spectral  property  (2.4). 

The  dual  to  this  procedure  involves  removing  the  fast  parts  of  the 
y  states.  Such  a  procedure  would  transform  (2.1)  into  (2.3)  satisfying 
condition  (2.5).  [12]  proposed  this  dual  procedure  which  led  to  matrix 
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(2.i4: 

(2.i5: 

(2.16) 


recursions 
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Again,  while  experimental  results  have  proved  successful,  conditions  -  for- .. 
convergence  as  well  as  convergence  rates  are  unavailable. 

ii)  Algebraic  Riccati  equation  method  [10] 

In  [10,  11,  26,27],  the  transformation  of  the  form 

n  «  z  +  Py  (2.17) 

is  proposed  in  an  attempt  to  transform  (2.1)  into  (2.2).  By  applying  (2.17) 
to  (2.1),  we  obtain 


(2.18) 


The  problem  is  to  find  the  solution  P  to  the  Riccati  type  equation 

R(P)  -  C-DP  +  PA-PBP  -  0  (2.19) 

such  that  A-BP  and  D+PB  have  the  spectral  properties  (2.4).  Such  spectrum 
dependent  solutions  have  been  referred  to  as  "dichotomic"  [28].  We  will 
throughout  the  rest  of  this  thesis  continue  to  refer  to  this  solution  by 
this  label. 

Earlier  work  by  [10]  and  more  recent  work  by  [9]  have  resulted  in 
the  following  iterative  recursion  formula  for  obtaining  the  dichotomic 
solution  to  (2.19) 

Pk+1  "  Pk  +  (D  +  PkB)-1-R(Pk)  (2.20) 

P  -  D_1C 
o 

which  gives  the  subsystem  matrices  in  (2.2)  at  each  k  as 


10 


A- BP, 


Ao"A 


(2.21) 


Dk  "  D  +  Pk-1B 


Do-D 


(2.22) 


Ck  "  R<Pk-l>  ■  C'DPk-l +  Pk-1A ‘  Pk-lBPk-l  Co"C* 


(2.23) 


Likewise ,  [11,35]  have  proposed  the  dual  to  the  Riccatl  method  via  the  trans¬ 
formation 


5  -  y  -  Pz 


(2.24) 


which  transforms  (2.2)  Into 


5  A-PC  AP  +  B-PCP-PD  5 


D+CP 


(2.25) 


The  problem  is  again  to  obtain  the  dichotomic  solution  P  to  the  Riccatl  type 
equation 


S(P)  -  AP  +  B  -  PCP  -  PD  -  0 


(2.26) 


such  that  A-PC  and  D+CP  have  spectral  properties  (1.5).  From  [11,36]  the 
following  iterative  scheme  was  derived  for  obtaining  the  dichotomic  solution 
to  (2.26) 


W*k  +  s(V<D+cV’1 


(2.27) 


P  -  BD 
o 


(2.28) 


which  leads  to  the  matrix  equations  for  (2.3)  as 


Ak"  A"Pk-lC 


A  ■  A 
o 


(2.29) 


Dk  "  D  +  CPk-l 


D-D 

o 


(2.30) 


Bk  ■  s<iVi>  ■  Bk-i  -  pk-i° + "u-i  -  VB- 


(2.31) 


u 


While  some  convergence  results  are  available  [9,10]  for  the  matrix  recursion 

(2.27) ,  they  are  either  conservative  or  limited  to  solving  only  (2.26).  We 
now  unify  and  extend  these  results  by  showing  that  these  iterative  techniques 
are  equivalent  to  either  dominant  right  or  left  eigenspace  power  iterations. 

We  now  give  a  lemma  which  establishes  a  convergence  criterion  for 

(2.27)  based  on  dominant  left  eigenspace  power  iterations.  In  the  process, 
we  shall  show  that  (2.11)-(2 .13)  and  (2.21)-(2 .23)  are  equivalent  at  every 
iterate. 

Lemma  2.1:  Given  (2.1),  if  the  spectrum  is  concentrated  in  two  groups  of  m 
such  n  eigenvalues  such  that 


max) X^J  <  min |  Ajl.  (2.32) 

i-l,n  j»l,m 

Then  under  mild  restrictions  [29]  on  the  initial  iterate  P  ,  (2.27)  will 

o 

k 

converge  to  the  dichotomic  solution  of  (2.26)  at  a  convergence  rate  of  e  , 
where 

max|x  |  ie  l,n 

e  **  — ,  .  (2.33) 

minj  X  |  j  e  l,m 

Proof;  The  well  known  power  iteration  method  [30,31]  for  computing  a  m- 
dimensional  basis  for  the  dominant  left  eigenspace  of  (2.1)  is  of  the  form 

[Mk  V  -  Wi 


A  B 

C  D 


(2.34) 


is  a  nonsingular  mxm  scaling  matrix  used,  for  example,  to  keep 
the  rows  of  [M^  strongly  independent  and  the  individual  components 
within  a  practical  range  of  computation  [31].  Many  methods  have  been  proposed 


for  selecting  the  sequence  of  R^'s  and  the  Interested  reader  is  referred  to 
[29]  and  [31].  The  analytical  convergence  of  (2.34),  however,  Is  indepen¬ 
dent  of  R.  .  Thus,  under  condition  (2.32)  and  mild  conditions  on  [M  N  ] ,  it  is 

K  O  O 

known  that  (2.34)  converges  to  the  dominant  m-dimensional  left  elgenspace  of 

(2.1). 

Expressing  the  common  iterates  as 

\  •  \-iA  +  Vic  Vc  <2-35> 

h1  ■  <D + v1-”'1-  <2  • 3B> 

We  can  form  the  product 

<D+\-iMk-i,)*1(c+\-iMk-iA>-  <2-37> 

Letting  gives  (2.37)  as 

Pk  "  (D  +  Pk-1B)"1(C  +  Pk-1A)  P0“D~lc  (2.38) 

which  is  equivalent  to  (2.27)  VkiO. 

To  show  the  solution  is  dichotomic,  P  is  of  the  form 

P  -  N-1M.  (2.39) 

Without  loss  of  generality  we  can  take  [M  N]  ■  [V^  where 

[V  V2]  are  the  m  left  eigenvectors  corresponding  to  the  m  dominant  eigen¬ 


values  ,  thus 


>T1  V 


A  B 


C  D 


0  J, 


IV1  V 


(2.40) 


VjA  +  V2C 


(2.41) 


V^B  +  V2D 


(2.42) 


However ,  from  (1.16) 


C  +  V2lyiA  “  DV2\  +  V2lviBV2lvi 


which  leaves  both  (2.41)  and  (2.42)  In  the  form 


v2(D+v-\b)v-1  -  J2 


(2.43) 


verifying  the  desired  spectral  decomposition  we  can  now  show  the  equivalence 
of  (2.11)-(2.13)  and  (2.21)-(2.23) . 

Corollary  2.2:  The  matrix  iterations  (2.11)-(2.13)  and  (2.21)-(2.23)  are 
equivalent  at  every  k.  Thus,  the  just  provided  convergence  properties  of 
(2.21)-(2.23)  are  propagated  to  (2.11)-(2.13) . 

Proof:  Substitution  of  (2.20)  into  (2.21)-(2.23)  results  in 


A-BPk_l  -  A-BPk_2-B(D  +  Pk_2B)  ^(P^) 


(2.44) 


D  +  Pk-1B  "  D  +  Pk-2B+ (D  +  Pk-2B)  *R(Pk-2),B 


(2.45) 


Ck-R(pk.l)  -  (D  +  Pk_2B)  .R(Pk_2).(A-BPk-1) 


(2.46) 


Letting 


\  "  A'BPk-l 


(2.47) 


Yk"  D+?k-lB 


(2. 43) 


ffk  '  Ck 


(2.49) 
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(2.47)-(2.49)  become 


\  '  Vi  -  "ViVi 


a  ■  A 
o 


(2.50) 


Yk  "  Yk-1 + Yk-l°k-lB 


Yo-D 


(2.51) 


ak  "  Yk-l°k-l\ 


c  -  C 
o 


(2.52) 


which  are  equivalent  to  (2.11)-(2.13)  Vk>0. 

We  now  cite  a  lemma  [9]  which  is  the  dual  to  Lemma  2.1  and 
establishes  the  conditions  of  convergence  of  (2.27)  based  on  dominant  right 
eigenspace  power  iterations. 

Lemma  2.3:  If  the  spectrum  of  (2.1)  satisfies  (2.32),  then  under  mild 
* 

restrictions  on  Pq  [9],  (2.27)  will  converge  to  the  dichotomic  solution  at  a 
convergence  rate  of  e  where  e  is  defined  in  (2.33). 

Proof:  The  well  known  power  iteration  method  [29,30]  for  computing  an  M 
dimensional  basis  for  the  dominant  right  eigenspace  of  (2.1)  is  of  the  form 


D  U 


M  -  B 
o 

N  -  D 
o 


(2.53) 


where  the  scaling  matrix  serves  the  same  purpose  as  explained  in  Lemma  2.1. 
Thus,  under  mild  restrictions  on  ^N° j  »  is  is  well  known  that  (2.53)  con- 


well  known  that  (2.53)  con¬ 


verges  to  the  dominant  m-dimensional  right  eigenspace  of  (2.1) 
Expressing  (2.53)  as 


*k  ’  ^-1  +  BNk-l 


M  =  B 
o 


K1  •  C<D+ati*£i)'1  so1-D‘l 
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fora  che  produce 


Vk1  ■  (B+AMk-A-i)(I>+CHk-iNk-i)'1' 


(2.54) 


Leccing 


w 

K  k  ' 


(B  +  AP^) (D  +  CPk>-1 


(2.55) 


(2.56) 


which  is  equivalent  to  (2.27)  Vk>0.  Proving  P  is  dichotomic  is  carried  out 

as  in  Lemma  2.1  and  can  be  seen  in  (9,36]. 


Finally,  we  can  show  the  equivalence  of  (2.14)-(2.16)  and  (2.29)- 


(2.31). 


Corollary  2.4:  The  matrix  recursions  (2.14)-(2.16)  and  (2.29)-(2. 31)  are 
equivalent  at  every  k,  thus,  the  convergence  properties  of  (2.55)  are  pro¬ 
pagated  to  (2.14)-(2.16) . 

Proof ;  Substitution  of  (2.27)  into  (2.29)-(2.31)  gives 


A-  (HC  -  A-  Pk_2C  -  S(Pt_2>  <D+  CPk_2>  C 


(2.57) 


D+CPk.x  -  D+CPk.2  +  CS(Pk.2)(D+CPk.2) 


(2.58) 


Bk  -  S(Pk.1)  -  (A-P^OSCP^XD+CP^)' 


(2.59) 


Letting 


ak  "  A_  ?k-lC 


(2.60) 


\“D  +  CPk-l 


(2.61) 


a  *  b 
Ck  “k 


(2.62) 


\  ■  Vr!k-iVic 


a  »  A 
o 


(2. 


1 

I 

I 

3 


9 


« 

r 

.■ 


« 


I 


gives 


Yk  “  Yk-1 +  CBk-lYk-l  Y0  "  D 


6k  ’  Wi'm 


g  -B 

o 


(2.64 

(2.65 


which  are  equivalent  to  (2.14)-(2.16) . 

Thus,  the  dominant  left  and  right  eigenspace  iterations  of  (2.38) 
and  (2.55)  can  be  used  to  transform  (2.1)  into  (2.2)  and  (2.3)  respectively, 
requiring  only  that  there  exists  a  corresponding  separation  in  the  original 
spectrum.  In  the  next  section  (2.2)  and  (2.3)  are  block  diagonalized  so  as 
to  isolate  the  fast  and  slow  dynamics  of  (2.1). 


2.3.  Block  Diagonalization  and  Identification  of  Fast  and  Slow  State 
Vector  Components 

Once  we  have  transformed  (2.1)  into  (2.2)  or  (2.3)  satisfying 
conditions  (2.4)  or  (2.5)  respectively,  block  diagonalization  is  always 
possible. 

Consider  form  (2.2),  and  the  transformation  (2.17)  used  to  obtain 
this  form.  The  dichotomic  solution  matrix  P  is  of  the  form 

p  -  if  Hi 

where  the  rows  of  [M  N]  span  the  dominant  left  eigenspace  of  (2.1).  Thus, 
the  exact  form  of  (2.2)  is 


y" 

'  A-BP 

B 

y 

9 

• 

w 

. 

.  0 

D  +  PB  . 

w 

(2.66) 
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Now,  let  x*y-Qw.  This  leaves  (2.55)  in  the  form 


l 


w 


A-BP  (A-BP)Q-Q(W-PB) +B 
0  D  +  PB 


(2.67) 


Thus,  we  seek  Q  to  satisfy  the  Lyapunov  type  equation 

(A-BP)Q- Q(D  +  PB) +  B  -  0  (2.68) 

Such  a  Q  will  always  exist  since 

cr(A  -  BP)  n  <j(D  +  PB)  3  ip  (2.69) 

(2.68)  may  be  solved  algebraically  [32]  or  iteratively  [10] .  One  obvious 
iterative  scheme  is  to  apply  the  dominant  right  eigenspace  iterations  used 
for  transforming  (2.1)  into  (2.3).  Since  (2.66)  satisfies  (2.32)  convergence 
is  assured.  Such  an  iteration  would  take  the  form 

Qk+1  *  (B  +  (A  -  BP)Qk) ' (D  +  PS)  “ 1 

Q  *  0 
o 

whichever  method  used,  the  resulting  system  is  of  the  form 


r. ' 

r* 

1 

3 

L*. 

0 

D  +  PB 


w 


(2.70) 


and  the  composite  transformation  is 

■y] 


i 


M 

I  __ 

Q  "  |x 

-p 

i-PQ  ,  j* 

(2.71) 


which  possesses  the  explicit  inverse 


(2 


Thus  we  seek  Q  to  satisfy  the  Lyapunov  type  equation 

.  * 

Q(A  -  PC)  -  (D  +  CP)Q  +  C«0 


such  a  Q  will  always  exist  since 

ct(A  -  PC)n  <r(D  +  CP)  -0 


(2  ' 


Again,  (2.78)  may  be  solved  iteratively  or  algebraically.  Applying  the 
dominant  left  eigenspace  iterations  to  (2*76)  convergence  is  assured, 
iteration  takes  the  form 


Qk+1  ■  (D  +  CP)"1  •  (C+Qk‘(A  -  PC))  Qq  -D_1C 
The  resulting  system  is  of  the  form 


• 

X 

“a  -  PC  0 

”y~ 

s 

• 

w 

0  D  +  CP_ 

z 

and  the  composite  transformation  is 


• 

X 

M  1 

1 

»XJ> 

_ I 

y 

s 

w 

Q  I  -  QP_ 

z 

with  also  possesses  the  explicit  inverse 


y 

z 


i  -  PQ 
-Q 


P 

I 


X 

w 


0 


C 


(: 


0 


Thus,  we  have  again  decomposed  the  y  and  z  state  vectors  into 
"fast"  and  "slow"  components.  Namely 


y  =  (I  -  PQ)x  +  Pw  =  y  +y 

'  slow  ytast 


z  =  -Qx  +  w  =  z  ,  +  z  . 

^  sLow  ra.sE 
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where 

x(t)  *e^A  '^x  x  «y  -  Pz 

o  o  o  o 

W(t)-e<D  +  C\  „o-Qyo+(I-Qp)2o 

The  relationships  between  various  fast  and  slow  components  of  (2.1)  will  be 
an  important  topic  of  the  next  chapter. 


2.4.  Ordering  of  State  Variables 

In  the  results  of  the  previous  two  sections  it  was  always  assumed 
that  there  existed  an  ordering  rf  states  such  that  D  ^  existed  as  well  as 
N  1  in  (2.39)  and  (2.54).  While  [37]  guarantees  the  existence  of  such  an 
ordering,  until  recently  no  jystematic  algorithm .was  available  to  achieve 
this  ordering  of  states.  [34]  has  developed  a  "grouping"  algorithm  for 
the  area  decomposition  of  power  networks.  By  applying  this  algorithm  to 
the  left  or  right  dominant  eigenspace  of  (2.1)  the  necessary  ordering  of 
states  can  be  obtained. 

Let  F  be  the  system  matrix  in  (2.1)  and  let  V  correspond  to  the 
matrix  of  right  eigenvectors.  Thus,  F*VAV  ^  where  A  is  the  eigenvalue 
matrix.  If  x»Fx,  and  we  let  x*Vy,  then 

x(t)  -  VeUy(0).  (2.86) 

Partition  V  as  V*  [V^  ;  Vg]  where  and  are  the  right  eigenvectors 

corresponding  to  the  fast  and  slow  spectrum  respectively.  Each  row  i  of 

V.  "weighs"  the  contribution  of  the  fast  modes  to  state  x..  If  V.  has  m 
r  it 

columns,  we  want  the  m  rows  of  that  are  the  most  linearly  independent 
to  correspond  to  our  fast  states.  This  can  be  done  by  performing  a  Gaussian 
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^  r’. 


3  ■ 


►  — 


►  *' 


B 


K 


elimination  with  complete  pivoting  [34]  on  the  rows  of  V^.  If  we  call 
the  m  most  linearly  independent  rows  of  and  permute  the  states  such  that 


(2.87) 


Then  since  FV^-V^A^,  we  obtain 


D  +  C(V^(vjp-1) 


2  2-1 
V  A  (V  ) 

f  V V 


(2.88) 


where  A.  is  the  dominant  eigenvalue  matrix  of  dimension  m.  By  this 
2 

construction,  * N  in  (2.54)  and  the  proof  of  dominant  right  eigenspace 
iterations  follows . 


Now  partition  V  ^  as 


W 

- 

lWf 


where  Wf  and  Wg  are  the  left  eigen¬ 


vectors  corresponding  to  the  fast  and  slow  spectrums  of  F  respectively. 

Again,  since  has  m  rows,  we  want  the  m  columns  of  that  are  the  most 

linearly  independent  to  correspond  to  our  fast  states.  This  again  can  be 

2 

done  using  Gaussian  elimination.  If  we  call  the  m  most  linearly  indepen¬ 
dent  columns  of  and  permute  the  states  such  that 


wf  -  iwj  ;•  W2f]. 


(2.89) 


Then  since  W^F =  A^W^,  we  obtain 


2-11  2-1  2 
D  +  (W.)  Vb  =  (Wp  AfWp 


(2.90) 


,-l 


hus,  N  will  exist  in  (2.37)  and  the  proof  for  the  dominant  left  eigenspace 


iterations  follows. 
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Thus,  depending  on  whether  you  want  to  transform  (2.1)  into  (2.2) 
or  (2.3),  the  ordering  of  states  can  possibly  be  different  using  the  above 
methods.  However,  in  general,  the  state  orderings  resulting  from  these 
methods  are  not  the  only  such  orderings  which  possess  the  desired  properties. 
In  many  cases  a  given  ordering  of  states  will  satisfy  the  conditions  necessary 
for  the  application  of  both  left  and  right  eigenspace  iteration.  The 
application  of  the  "grouping"  algorithm  merely  assures  us  that  there  exists 
orderings  of  states  which  satisfy  the  conditions  assumed  in  the  lemmas.  In 
many  cases,  such  as  singularly  perturbed  models,  the  proper  ordering  of 
states  can  be  done  by  inspection. 


2.5.  Example 

-  Decomposition  of  States 

into 

Fast  and 

Slow 

Components 

In 

given  as 

[39] ,  the  8th 

order  model 

of  an 

isolated 

mixed 

power  system  is 

-.2 

0 

0 

0 

0 

0 

0 

0 

4.75 

-5 

0 

0 

(3 

0 

0 

0 

0 

.16667 

-.16667 

0 

0 

0 

0 

0 

x  * 

0 

0 

2  -2 

0 

0 

0 

0 

0 

-.08 

-.07467 

112 

-3.9944 

10 

-.92778 

-9.1 

0 

0 

0 

0 

9 

-.5 

0 

0 

0 

0 

0 

0 

0 

0 

-1.39 

-.278 

0 

.01 

.0093 

014 

-.06319 

0 

.11597 

- .  1 12  3  6  L_ 

12.91) 


r 


using  a  permutation  of  x  »  Py 


gives 


P- 

<el>e3’V 

e8*W 

e7’V 

2 

0 

0 

0 

0 

0 

0 

0 

0 

-.167 

0 

0 

0 

0 

0 

.167 

0 

0 

-.5 

0 

.2 

0 

0 

0 

0 

.009 

0 

-.112 

-.063 

.014 

.116 

01 

0 

-.075 

10.00 

-9.101 

-3.994  - 

.112 

-.927 

-.08 

0 

2.00 

0 

0 

0 

-2.00 

0 

0 

0 

0 

0 

-.277 

1.319 

0 

-1.386 

0 

.75 

0 

0 

0 

0 

0 

0 

-5.00 

The  eigenvalues  of  (2.92)  are 


(2.92) 


eigenspace  iterations  we  obtain 

-0.20000  0.0' 

A-P3-  °* 15834  -°-1' 

-0.00312  -0.01 

0.00877  o.o: 


-1.3884147 

+  0. 0000000 J 

-0.1291288 

+  0.2124795J 

-0.1291288 

0.2124795J 

-4.3489879 

+  0.0000000J 

(2 

-2.0000000 

+  O.OOOOOOOJ 

-0.1666700 

O.OOOOOOOJ 

-5.0000000 

+  O.OOOOOOOJ 

-0.2000000 

+  O.OOOOOOOJ 

we  obtain  an 

s  of  .1792.  Using  the  dominant 

left 

obtain 

0.00000 

0.00000  0.00000  ■ 

-0.16667 

0.00000  0.00000  ! 

(2 

-0.00766 

-0.08981  -0.36571  ; 

i 

0.02153 

0.09635  -0.22145  J 

which  has  eigenvalues 


-0.15563  + 

0.175 79J 

-0.15563  - 

0.17579J 

-0.16667  + 

0.00000J 

-0.20000  + 

O.OOOOOJ 

-4.52014 

-0.08640 

-0.71572 

-0.05533 

D  +  BP  « 

0.00000 

-2.00000 

0.00000 

-0.16667 

o 

0.80733 

0.02712 

-1.16426 

0.02543 

_  0.00000 

0.00000 

0.00000 

-5.00000 

which  has  eigenvalues 

-4.33808  + 

O.OOOOOJ 

-1.34632  + 

O.OOOOOJ 

-2.00000  + 

O.OOOOOJ 

-5.00000  + 

O.OOOOOJ 

using  the  dominant  right 

eigenspace  iterations  we 

get 

-0.20000 

0.00000 

0.00000 

0.00000 

A-C?  - 

0.15834 

-0.16667 

0.00000 

0.00000 

o 

-0.00312 

-0.00766 

-0.08981 

-0.36571 

_  0.00877 

0.02153 

0.09635 

-0.22145 _ 

which  has  eigenvalues 

-0.15563  + 

0.17579J 

-0.15563  - 

0.17579J 

-0.16667  + 

0 . OOOOOJ 

-0.20000  + 

O.OOOOOJ 

-4.31691 

-0.03023 

0.04757 

-0.05415 

D  +  P  C  » 

0.00000 

-2.00000 

0.00000 

-0.06667 

0 

1.32208 

0.00179 

-1.36749 

0.00051 

0.00000 

0.00000 

0.00000 

-5.00000 

(2.95) 


(2.96) 


(2.97) 


(2.98) 


(2.99) 


(2.100) 


Which  has  eigenvalues 


-4.33808  +  O.OOOOOJ 
-1.34632  +  O.OOOOOJ 
-2.00000  +  O.OOOOOJ 
-5.00000  +  O.OOOOOJ 


(2.101) 


To  show  how  this  accuracy  may  be  improved,  after  two  iterations  we  obtain 


-0.20000 

0.16492 

-0.00233 

0.00979 


0.00000 

-0.16667 

-0.00741 

0.02836 


0.00000 

0.00000 

-0.08420 

0.13210 


0.00000 

0.00000 

-0.36234 

-0.17633 


with  eigenvalues 


-0.13027  +  0.2 1388 J 

-0.13027  -  0.2 1388 J 

-0.16667  +  O.OOOOOJ 
-0.20000  +  O.OOOOOJ 


D  +  BPX  - 


-4.52468 

0.00000 

0.76779 

0.00000 


0.08664 

2.00000 

0.02154 

0.00000 


-0.71767 

0.00000 

-1.21045 

0.00000 


with  eigenvalues 


-0.05571 

-0.18172 

0.01334 

-5.00000 


-4.34912  + 

O.OOOOOJ 

-1.38601  f 

O.OOOOOJ 

-2.00000  + 

O.OOOOOJ 

-5.00000  + 

O.OOOOOJ 

-0.20000 

0.00000 

0.00000 

0.00000 

0.16379 

-0.16667 

0.00000 

0.00000 

!  -0.00237 

| 

-0.00609 

0.00444 

-0.45814 

0.00821 

0.02283 

0.13946 

-0.26497 

(2.102) 


(2.103) 


(2.104) 


(2.105) 


(2.106) 


with  eigenvalues 


-0.13027 

+ 

0.21388J 

-0.13027 

- 

0.21388J 

-0.16667 

.  + 

0. 00000 J 

-0.20000 

+ 

0.0000QJ 

-4.37192 

-0.03420 

-0.05145 

-0.05565 

D  +  PjC  - 

0.00000 

1.32327 

-2.00000 

0.00202 

0.00000 

-1.36221 

-0.06896 

0.00048 

_  0.00000 

0.00000 

0.00000 

-5.00000 

with  eigenvalues 

-4.34912 

+ 

O.OOOOOJ 

-1.38601 

+ 

O.OOOOOJ 

-2.00000 

O.OOOOOJ 

-5.00000 

+ 

O.OOOOOJ 

(2.107) 


(2.108) 


(2.109) 


We  now  give  graph  of  selected  states  along  with  their  fast  and 
slow  components  using  (1.85),  (1.86),  (1.96)  and  (1.97)  for  both  the  left 
and  right  eigenspace  decompositions.  The  plots  will  be  based  on  the  PQ  and 
iterates.  On  the  graphs  of  the  individual  components,  the  following 
legend  will  be  in  effect 

ACTUAL  STATE  - 

SLOW  COMPONENT . 

FAST  COMPONENT  . 


On  the  graphs  of  the  actual  state  versus  the  approximated  state 

ACTUAL  STATE  - 

APPROXIMATED  STATE  . 


The  plots  appear  on  the  next  several  pages.  The  system  is  perturbed 


and  added  components 


I’igure  1.7.  State  6  and  compo 
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TIMECSECONDS} 

figure  1.8.  State  6  and  added  components  using  right  eigenspace  iterations. 
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CHAPTER  3 


ASYMPTOTIC  SERIES  DECOMPOSITION  OF  TIME-SCALES 
IN  LINEAR  TIME-INVARIANT  SYSTEMS 


3.1.  Introduction 

When  small  parameters  are  present  in  differential  equations  defining 
initial  or  boundary  value  problems,  one  of  the  popular  methods  used 
[30]  is  to  obtain  an  asymptotic  power  series  expansion  of  the  solution.  Such 
techniques  have  been  well  documented  and  can  produce  approximate  solutions 
to  problems  where  otherwise  explicit  analytic  solutions  are  impossible  or 
exact  numerical  solutions  are  computationally  not  practical.  Such  systems 
are  of  the  form 

x  -  f(x,t,e)  x(0)  *  xq  (3.1) 

and  a  solution  is  sought  of  the  form 

x(t)=»x°(t)+  £X^(t)  + .  (3.2) 

When  such  an  expansion  converges  uniformly  in  t  as  e-*>  0  we  have  a  regular 
perturbation  problem  [ 8  ] .  If  there  is  a  region  of  nonuniformity,  usually 
at  one  of  the  boundaries ,  we  have  a  singular  perturbation  problem.  In  most 
cases,  the  dynamics  of  the  solution  vector  within  this  region  of  nonuniform 
convergence  involve  fast  transients  or  the  so  called  ,rboundary  layer 
phenomena."  Thus,  such  singularly  perturbed  systems  [8  ]  are  said  to  possess 
an  inherent  two- time- scale  property  characterized  by  a  steady  state  or  "outer 
solution"  which  is  defined  by  the  regions  of  uniform  convergence  of  (3.2), 
and  the  boundary  layer  or  "inner  solution"  where  a  stretched  time  variable  is 
usually  introduced  in  order  to  achieve  asymptotic  convergence  on  the  total  tir 
interval.  In  the  linear  case  such  systems  take  the  form 


(3.3) 


y  -  Ay  +  Bz  y(0)«yQ 
H*z  ■  Cy +  Dz  z(0)  -  z 

o 

Much  work  has  been  done  Co  expLoit  che  mulciple  time  scale 
property  of  (3.3)  when  designing  regulators,  pole  placement,  reduced  order 
modeling,  etc.  [3,7,40].  As  a  result,  for  a  system  which  is  known  to 
have  fast  and  slow  phenomena,  the  systems  engineer  is  motivated  to  permute 
the  state  in  order  to  attain  the  above  structure  and  take  advantage  of  these 
decomposition  techniques.  It  is  the  purpose  of  this  chapter  to  use  multiple 
time  scale  asymptotic  expansions  to  obtain  a  "steady  state"  and  "boundary 
layer"  decomposition  in  (3.3)  and  show  how  this  decomposition  compares  to  the 
eigenspace  decompositions  of  Chapter  2. 

% 

In  Section  3.2  we  obtain  power  series  representations  of  our  dichotomic 
transformation  matrices  P  and  P. 

In  Section  3.3  we  derive  important  relationships  between  various 
fundamental  sets  of  solutions  to  (3.3)  and  any  system  satisfying  (2.32). 

These  fundamental  sets  are  based  on  our  reduced  order  subsystem  matrices  and 
the  dichotomic  transformation  matrices  P,  P,  Q,  and  Q. 

In  Section  3.4  we  use  Vasil' eva's  method  of  matched  asymptotic 
expansions  to  obtain  the  "boundary  layer"  and  "steady  state"  components  of 
the  solution  vectors  y(t)  and  z(t).  It  is  shown  that  this  decomposition  is 
equivalent  to  the  eigenspace  decompositions  of  Chapter  2  by  using  one  of  the 
fundamental  solution  sets  established  in  Section  3.3. 

Section  3.5  discusses  some  computational  simplifications  to  the 
dominant  left  and  right  eigenspace  iterations  based  on  system  (3.3).  The 


simplifications  involve  eliminating  the  necessity  to  take  an  inverse  at 
every  iteration. 

Finally,  in  Section  3.6  we  discuss  an  important  application  of 
these  techniques,  namely,  reduced  order  control  law  design. 


3.2.  Series  Solutions  to  Riccati  Iterations 

For  proper  spectral  decomposition  and  dimensions  m  and  n  in  (2.D, 
it  was  shown  in  Section  2.2  of  Chapter  2,  that  the  following  matrix 
recursion  equation  for  finding  the  dominant  left  eigenspace  of  (2.1) 

Vi  -pk  -  <°+V»'1-  <DVc+pkBVpkA)  u-4) 

P  »  D_1C 
o 

will  converge  to  the  dichotomic  solution  of 

R(P)  =  DP  -  C  +PBP  -  PA  »  0  .  (3.5) 

When  (2.1)  is  in  the  form  of  a  singularly  perturbed  model,  (3.4)  and  (3.5) 
become 

pk+t  ■  \  ’ (D +,iPkB)‘l  •  <DPk  - c +“pkBPk  -  *pkA)  <3-6) 

and 

R(P)  *DP  -  C  +HPBP  -  4PA  *  0  (3.7) 

respectively.  Approximate  solutions  to  (3.7)  have  been  used  in  [6,7,26]  to 
construct  near  optimal  control  laws  for  singularly  perturbed  systems. 

In  this  section  we  construct  an  asymptotic  approximation  (to  N  terms.) 


of  the  matrix  function  P(r)  as  r- -  0  with  respect  to  the  asymDtotic  sequence 
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&  1 

it-  j.  TJius,  we  seek  a  solution  to  (3.7)  of  the  form 


P(^)  -P0  +  P-PL  +  .. 


(3.8) 


Substituting  (3.8)  into  (3.7)  and  matching  separately  like  powers  of  P-  to  zero 
we  obtain 


DP  -  C  -  0 


(3.9) 


>1  +  PC^P°  -  P°A  -0 


(3.10) 


DPN+.2  pN"l"-W  -  pn-1a  *  0 
j*o 


(3.11) 


Hence,  each  term  in  the  series  (3.8)  is  uniquely  defined  as  follows; 


o  -l 
P  -D  C 


(3.12) 


1  -loo  -1  o 
P  *-D  P  BP  +  D  P  A 


PN  =  -D'1  pN_L"jNPj  +D“1I^’lA 


(3.13) 


the  asymptotic  correctness  of  this  series  is  obvious  and  we  thus  write  (3.8)  in 
the  form 


iN 

p(p.)  mZQ  ^pJ  +  0(^N+L) 


(3.14) 


or  using  standard  notation 


-j  L 


(3.13) 


One  question  we  might  ask  is  how  do  the  iterations  of  (3.6)  relate 


to  the  individual  terms  in  the  series  (3.14).  After  considerable  algebra, 
it  is  possible  to  show  that 
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Jopl  -  PN  ■  <W‘t,+l>  (3-16) 

for  N  •  0,1, . 

This  is  to  be  expected,  since  (3.6)  converges  to  the  dichotomic  s 

solution  of  (3.7)  which,  due  to  the  uniqueness  of  (3.14)  with  respect  to  the 

r  Vc 

asymptotic  sequence  J,  must  have  this  asymptotic  expansion. 

One  area  of  research  in  the  use  of  asymptotic  expansions  that  has 
received  little  attention  is  nonlinear  difference  equations.  If  we  interpret 
(3.6)  with  Pq-d'^C  as  an  initial  value  problem  for  a  matrix  system  of  nonlinear 
difference  equations,  we  can  construct  an  asymptotic  series  solution  to  (3.6) 
of  the  form 

Pk‘Pk  +  W?k  +  **--  ,  (3’L7)  ’ 

Note  that  such  an  expansion  converges  as  H*-*0  uniformly  in  k  and  thus  defines 
a  regular  perturbation  problem  [8]. 

Expressing  (3.6)  in  a  more  convienient  form 

(D  +  uPk8)  Pfc+1  *  C  +  uPjA  .  (3 .  i*) 

We  now  substitute  in  the  series  (3.17)  and  obtain 
[(D+u(p°+^  +  ....)B)(P°+1  +  ,pj+1  +  ....)! 

■  c  +  u(P°+-pJ  +  ..)A. 

Equating  like  powers  of  -  we  obtain  the  so  called  "equations 


of  the  variations"  [13]. 
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DPk+l  *  C 


DPk+l  +  PX-H  '  C 


<*i +  %  <‘1‘J“pd+1  -  C1* 


Since  D  *  exists,  we  can  solve  (3.19)  as 

Ci  -  D'lc 

and  since  our  matching  condition  implies 


P°  a  D 
o 


P  -  0,  N  >  o 
o 


The  solution  to  (3.22)  is  thus, 


o  - 1 

Pk  58  0  C  Vk  2  0 


Likewise,  the  solution  to  (3.20)  is  found  from 


k+1 


-  d'1P°3P°  +  D_1P°A 

k  k+i  k 


However,  since  P,  , , 
k-rl 


■  P,  Vk  £  0 


.+1 


-  D-1?^3P°  +  d'1??A 

k  k  k 


=  constant  matrix 


.nus  , 


=  0 


=  0 


- 1  o  o  - 1  o 
D  P.  3P,  -r  D  p  i 

X  tC  S. 


(3.19) 

(3.20) 


(3.21) 


(3.22) 


(3.23) 


(3.24) 


A 
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Continuing  in  this  manner,  the  equilibrium  solution  for  the  Nth 
variation  is  attained  after  N  iterations  an  is  of  the  form. 


P?"1“'iBP^ 


(3.25) 


which  is  equivalent  to  (3.13)  ?k2N.  While  it  is  clear  that  (3.17)  asympto¬ 
tically  solves  (3.6)  for  any  finite  time  interval,  general  stability  and 
asymptotic  correctness  results  for  discrete  time  perturbation  problems 
remains  an  open  research  area. 

In  a  completely  analogous  manner,  the  series  solution  for  the 
equilibrium  of  the  dominant  right  eigenspace  iterations 


Pk  +  (uB+uAPk  -  PkD  -  PkC  Pk)  •  (D  +  C  Pk) 


(3.26) 


has  the  form 

~o  ^  1 
?  *  P  +  UP 


+ 


(3.27) 
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where 


(3.28) 


(3.29) 


N-l 


F'  =  (-  Z  PN"JCPJ)- d‘L+APN‘1d"1. 
j=l 


(3.30) 


Thus , 


P(u) 


-  5,^uo(,w). 

3-0 


(3.31) 


The  series  solutions  (3.14)  and  (3.31)  will  be  of  importance  in  Section 
3.4  when  we  obtain  a  two- time-scale  asymptotic  series  solution  to  (3.3). 


3.3.  Fundamental  Sets  of  Solutions 

In  this  section  we  develop  some  basic  properties  relating  the 
dichotomic  dominant  left  and  right  eigenspace  transformations  of  Chapter  2. 
The  need  for  these  properties  will  become  apparent  in  the  next  section. 

One  of  the  basic  properties  of  linear  homogeneous  systems  of 
differential  equations  of  the  form 

x  =  Fx  (3.32) 

is  that  a  fundamental  matrix  [41 ’  is  of  the  form 


X(t)  =  e' 


!.  33) 


ror  a  given  initial  value  Problem  x<t 


x  ,  me  i- 


:o  (3.32)  for  t  i  t  is  unicue Lv  aiven  bv 

o  ... 


48 


x(t)  -  X(t)X(t  )-1x  (3.34) 

o  o 

or  when  using  (3.33) 

F(t-t  ) 

x(t)  -  e  xQ  • 

Another  property  of  the  fundamental  matrix  (3.33)  is  that 
given  any  nonsingular  matrix  M, 

Y(t)  -  eFt*  M  (3.35) 

is  also  a  fundamental  matrix. 

For  system  (2.1)  satisfying  condition  (2.32)  We  have 

a  ^ 

established  transformation  matrices  P,  Q ,  P,  and-Q  such  that 


:  A 

B  : 

’  3 

:  I  q 

A-BP  0 

1 

I-QP 

-Q 

(3.36) 

;  c 

E 

i 

, 

-P  I-PQ| 

0  D+PB  ! 

P 

I 

A 

3 

|  ^  ! 

:  I-PQ  P  ' 

■  A -PC  0 

I 

-? 

(3.37) 

C 

E 

-Q  I  _ 

0  D+CP  | 

_  Q 

I-QP 

'"a 

B 

c 

D 

t 

I  Q 

^e(A-BP)t 

o  : 

I-QP  -Q 

e 

-P  I-PQ 

0  e (D+PB) t 

P  I 

(3.33) 

is  a  fundamental  matrix  for  (2.1). 
Thus,  by  (3.35),  so  is 
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'a  b" 


I  Q 


-P  I-PQ  -P  I-PQ 


(A-BP)t 


(D+PB)  t 


(3.39) 


or,  the  columns  of 


|  e (A-BP) t  Qe(D+PB)t  1 


-Pe(A-BP)t  (I-PQ)e(D+PB)t 


(3.40) 


fora  a  fundamental  set  for  the  system  (2.1).  Likewise,  using  a  similar 
argument  for  (3.37),  the  columns  of 


(I-M)e(A"PC)t  Pe(D+CP)t 


X(t)  = 


(3.41) 


(D-H3P)  t 


Now,  by  the  dichotomic  property  of  the  transformations,  there 

exist  nonsingular  matrices  T, ,  TV  ,  T  and  T,  such  that 

12  3  4 


T’1(A-BP)T1 


(3.42) 


T“1(D+PB)T0 


(3.43) 


-1 

:3  (a-pc)T3 


(3.44) 


t4‘(D-K;p)r4 


C  .  -»  3  ) 


where  is  the  dominant  eigenvalue  matrix  and  is  the  eigenvalue 


matrix  consisting  of  the  rest  of  the  spectrum  of  (2.1). 
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Using  (3.42)  ,  (3.43)  ,  (3.44)  ,  and  (3.45)  in  (3.36)  and  (3.37) 


gives 


[  A 

B 

A  A 

(I  PQ)^ 

PT2  !  f^i  0  Jt"1  -t'XP  i 

C 

S 

D 

-QTi 

T-  1 

2_! 

:  i 

0  *2  LT2^  T^a-QW 

|  A 

T3 

or 

Ax  0  *j  j  T“Xa-QP)  -T^Q 

L c 

D  J 

(I-PQ)T4 

1°  *2  ■  _TIlp  "a1. 

by  definition  identifies 

j  A  A 

j  (I-PQ)^ 

. 

t 

?T2 

*3  QI4  !  ' 

j 

1 

1 

i 

-QTX 

f 

T2_ 

-fi3  (I-PQ)T4 

as  eigenvector  matrices  for  (2.1). 

While  the  magnitudes  of  eigenvectors  are  not  unique,  their 
directions  are.  Thus, 


2 


■3 

e 


;  2  3 

also  serve  as  a  fundamental  set  for  (2.1),  or  in  matrix  form 


(3.46) 


(3.47) 


(3.48) 


(3.49) 


T3a  PT2e 


(3.30) 


I 


However,  postmultip lying  (3.50)  by  Che  nonsingular  matrix 


gives 

H  e(A-BP)t  pe(D+CP)t  | 
Y(t)  *  | 

!  .Pe(A-BP)t  e  (D4CP)t  | 

Also,  by  a  similar  argument  on  (3.46)  and  (3.47), 


(3.51) 


(3.52) 


also  qualifies  as  a  fundamantal  matrix  for  (2.1). 

Thus,  in  this  section  we  have  established  the  existence  of 
four  fundamental  matrices  for  (2.1)  based  on  the  dichotomic  transformation 
matrices  P,  Q,  P  and  Q.  This  flexibility  will  prove  valuable  in  the  next 
section  concerning  asymptotic  expansions  of  our  singularly  perturbed 


model  (3.3). 
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3.4.  Solution  by  Asymptotic  Expansion  Using  the  Method  of  Vasil 'eva 

Using  the  results  established  in  the  £irst  two  sections  of  this 
chapter,  we  will  attempt  to  solve  (3.3)  using  asymptotic  expansion  techniques 
In  [13],  the  method  of  matched  asymptotic  expansions  was  proposed 
as  a  method  of  obtaining  asymptotic  solutions  to  the  general  nonlinear 
singularly  perturbed  initial  value  problem 


*  f(z,y,t)  y(0)  *  yQ 

^  dt  “  F(2»y>t>  2(° )  38  2q  • 


(3.54) 


To  use  this  method,  it  is  assumed  that  the  root  z  =  co(y,t) 
of  the  equation 

F(y,z,t)  *  0  (3.55) 

is  stable  in  the  first  approximation  or  specifically,  the  real  parts 
of  the  roots  of  the  characteristic  equation 


DET( 


T,  21 


dz 


-  XI) 


(3.56) 


z*c(  y,t) 

be  negative  in  D,  where  D  is  a  closed  bounded  domain  in  the  variables 
tQ  ^  t  ^  t^,  z  <  K^,  y  <  K,  ,  and  0  ^  j.  <  Under  this  assumption, 
the  method  can  be  applied  to  (3.3)  as  clearly  carried  out  in  [13]. 

In  our  case  (3.54)  reduces  to 


dv 

dt 


=  Ay 


+  3z 


v(0)  =  v 
*  o 


(3.57) 
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and  the  assumption  becomes 


m 


ReC^)  <  0  VX1ea(D)  i-1,  , 

with  this  assumption  satisfied,  we  can  proceed  with  the  asymptotic  solution 
method  proposed  by  Vasil'eva. 

A  solution  to  (3.57)  is  sought  of  the  form 


y  ■  y  + 


Z  *  2  + 


where 


y  *  yQ(t)  +  n  yj_(t)  + 


(3.58) 

(3.59) 

(3.60) 


denotes  a  formal  power  series  in  u>  whose  coefficients  depend  on  t, 
and 

“v  *  ^oy(T)  +  ur^yCr)  + . 


(3.61) 


Denotes  a  formal  power  series  in  u  whose  coefficients  depend  on  t=  t/u. 

Substitution  of  (3.60)  ,  (3.61)  and  the  analogous  expansions  for  z 
into  (3.57)  yields 


,  dv  ,  d 

T—  +  —  Try 
dt  dT  3 


-  uA(y+ny)  +  nBCz+rrz) 


(3.62) 


“  dt  +  d7  ~z  =  C^+TTy)  +  D(z+"z)  . 


Equating  the  coefficients  of  equal  powers  of  u.,  those  depending  on  t  and 
those  depending  on  r  being  treated  separately,  we  arrive  at  the  following 
equations  for  the  variations. 


Zeroth  order 


CyQ(t)  +  DzQ(t)  =  0 

tt*  z(t)  *  Cir  y (t)  +-Dir  z(t) 
a"  ooo 

dy0(t) 

“dt - Ayo(t)  +  Bzo(t) 

dT  TToy(T)  “  0  * 

First  order, 

dz  (t)  _  _ 

-jf - Cy^t)  +  Dz^t) 

drr-z(T) 

- -  *  CT^yOr)  +  DttiZ(t) 

dyT(t) 

-  *  Ay:  (t)  +  Bz1(t) 
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*v‘T) 


AViy<T)  +  BVi2(T)- 


(3.74) 


Since  we  are  considering  the  initial  value  problem,  the 


matching  conditions  become 


t  z(0)  +  z  (0)  =  z 
0  o 


(3.75) 


r  y(0)  +  y  (0)  =  y 
o  o 


(3.76) 


and  for  k  ^  1 


r.  z  (0)  +  z  (0)  =  0 
tc  k 


(3.77) 


nky(o>  +  yk(0)  =  0 


(3.78) 


and,  due  to  our  stability  assumption. 


~ky(x)  =  \z(®)  =  o  k  s  o 


(3.79) 


Solutions  of  this  type  are  referred  to  as  "inner"  and  "outer", 


"fast"  and  "slow",  or  "steady  state"  and  "boundary  layer"  depending  on 


the  author. 


Our  purpose  here  is  to  show  that  the  series 


(3.80) 


are  equivalent  to  the  solution  of  < 3  . 3 )  obtained  using  the  fundamental 


matrix  (3.52)  .  In  other  words, 


(3.8 


i  l  : 

!  i\  :  -p 


y  (0) 


r  tt  T  r  p  i 
i 5  .  .»■ 

t 

:  tt  I 

L  z-  L  J 


(D-+CP)t 


rr  (0). 
2 


The  approach  here  is  to  show  that  yg^ow  and  zs^ow  of  (2.73)  and 
(2.74)  found  using  dominant  left  eigenspace  iterations  are  equivalent  to  y 
and  z  respectively  in  (3.80).  We  then  show  that  y^  and  z£agt  of  (2.84) 
and  (2.85)  found  using  the  dominant  right  eigenspace  iterations  are  equi¬ 
valent  to  ir  and  tt  respectively,  in  (3.80).  This  is  the  purpose  of  using 

y  2 

fundamental  matrix  (3.52)  since  it  expresses  the  solution  in  terms  of  these 
components.  Other  fundamental  matrices  involve  the  Lyapunov  solutions  Q 
and  Q  that  possess  complex  series  expansions  which  we  want  to  avoid. 

First,  we  seek  an  asymptotic  solution  to  yg^ow  of  the  form 


:(t)  =  xQ(t)  +  ^(t)  + 


and  a  solution  of  z  ,  in  the  form 

slow 


-Px(t)  -  -P  x  (t)  -  u.(?  x  +  P  x  ) 
oo  iooi 


where  x(t)  is  the  transformation  variable  of  (2.70).  When  (3-3)  is 
used  as  the  svstem  model 


A-P(u) 3 


57 


Substituting  in  the  formal  power  series  (3.83)  and  (3.7) 
into  (3.84)  and  equating  like  powers  of  u.,  we  obtain  the  following 
equations  of  the  variations 
x  -  (A-BP  )x 

O  O  0 

"  <A’BP0)xrBPlXo 


(3.85) 

(3.86) 
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(A'BP0)y1-BP1yo 


(3.91) 


and  thus,  for  (3.71)  and  (3.73) 


z,  =  -  P  y,  +D 
k  ok 


•1  dzk-l 


!5ci- .  P  fi=i.  P  fizi 

dt  o  dt  1  dt 


k-1  dt 


(3.92) 


z,  =  -  P  y, 
k  -  ok 


d”1?  a  +  d"1?  bp 

o  o  o 


-1  -1  *  L 
-  D  P,  .A  +D  L  Z  P.BP,  .  . 

k'1  j-o  J  k'1'J 


P  y,  -  P.y,  .  -  -  -  P,  V 

oyk  1  k-1  k  o 


*  -  1  vJ* 
1-0  *  J  J 


(3.93) 


(A-BP  )y,  -  3  Z  P,  .v. 

0  k  j=0  *-J  J 


(3.94) 


which  is  equivalent  to  (3.87)  Vk.  Plus,  it  is  obvious  from  (3.93)  that 
7  -  -  (Px)  ,  Vk. 

tC  tv 

We  now  consider  the  fast  components. 

Using  the  dominant  right  eigenspace  iterations,  the  singularly 
perturbed  model  (3.3)  is  transformed  into 
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A-P(u)C 


(3.95) 


where 


x  =  y  -  Pz 
o^o  o 


*o  =  Qyo  +  (I-QP)zo. 


P  is  obtained  from  (2 .26  )  and  Q  is  then  obtained  from  (2 .78 ) . 
The  dichotomic  nature  of  our  fast  and  slow  components  led  us  to  identify 
fast  and  slow  components  of  y  and  z  as 


y  ,  =  (I-PQ)x  y.  =  Pjj 

-’slow  ^  fast 


z  ,  =  -Ox 

s  low 


z .  »  x 

fast 


The  differential  equation  for  the  fast  state  vector  is 


D-iCP  . 
uj  - - - —  Ul. 


(3.96) 


Let  "  *  t/u,  then  (3.96)  becomes 


&  =  ( D  +C  P )  x  (  t  )  . 


(3.97) 


We  now  seek  an  asvmptotic  solution  to  t.  _  of  the  form 

ras 


=  x  (")  +  _  L-  (r) 
o  1 


(3.9S) 


and  a  solution  to  v.  of  the  form 

-  rast 


RU(T)  =  PoXQ(-)  +  aCP^r)  +  P1^o(-)) 


i 
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Substituting  the  formal  power  series  (3.98)  and  (3.27)  into  (3.97) 
and  equating  like  powers  of  u,  we  obtain  the  following  equations  of 
the  variations 


dojQ(T) 


Dui  (r) 
o 


(3.99) 


dm1(T) 


Duj.  +  CP/ju 
1  1  o 


(3.100) 


daHt(T)  «  k;x «  . 

-P  -dVc£,VjV 


(3.101) 


We  will  now  show  that  the  differential  equations  necessary 
to  solve  for  rr^z(r),  k  S  0  are  equivalent  to  (3.101)  Vk.  The  equivalence 

A  -A 

of  r-  y(-r)  and  (F’ii)^  is  a  byproduct  of  this  derivation. 

From  (3.66)  and  (3.79) 


7  =  0. 


(3.102) 


Thus , 


d"  2 

-  =  D1-  z 

d 1  o 


(3.103) 


From  C3.70) 


d**1y(' ) 

—T -  =  A-  v  -r-  B-  2  . 

d~  o'  o 


(3.104) 


\y(~ )  =  "  y ( 0 )  +  [A-  y(T)  +  3-  =(r)]d: 


(3.103) 


To  establish  ^T1y (0) ,  from  (3.79) 
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0  -  n  y(0)  +  j  [A-  y(<r)  +  Bn  z  (a)  . 
10  o 

o 


(3.106) 


Thus , 


CO 

7T1y(T)  *  "  J*  [An^y (c )  +  BnQz(CT)]dc 


(3.107) 


Since 


r 

o 


y(~)  ■  o 


CO 

TT1y(‘r)  »  -  J  Brr  z  (c)dc 
i  w  o 

T 


d“Q2  (cr)dJ 
dcr 


*  -  BD  z(oc)  -  n  2(f)l 
1  o  o  v  ' 

=  BD  n  z(~)  =  P  -  z(-) 
o  1  o 


and  as  a  result 


d-1z(-) 


dr 


"  Dn  z(-)  +  CP  ~  z(r) . 

■L  i  O 


th 


Finally,  for  the  k  variation 

C9 

TV(?>  =  -  [A-.  ,y(-r)  +  Bi~  z(J)]dr 


(3.108) 


(3.109) 


(3.110) 


(3.111) 
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(A-BD  iC)P2rrk_3z(cr)dc: 


09 

...  -  J*  (A-Bd'^Pj^tt  z(ff)dff  (3.112) 


using 


Vi(T)  *  D  -df 


-1  dViz(T)  -1  k*2 . 

~ ^7  -D  CS  *(T) 


k-l-j  j 


(3.113) 


Tky(T) 


;  f  dViz(g) 

1  TJ  da 


da-P  j 


dTTk-2z(a) 


2  "  da 

T 


®  drr  z(a) 

a  r  O 

Pk  J  de  da 


piVi2(t)  +  Vk-2I(T)  +  ••••  +  PkV(T> 


(3.114) 


S  P,  .r.z(-r) 

j=0  k’J  J 


Thus , 


drkz(T) 


=  Dtt  z(T)  +  C  E  P,  .n-.z(T) 

k  j_q  k-j  j 


(3.116) 


T.vhich  is  equivalent  to  (3.101),  Vk  ^  0.  Also,  from  (3.113), 


<rA  *  <PA  Yk  2  0 


(3.117) 


Thus,  we  have  shown  that  y  satisfies 

o 


dT  -  <A  -  3P^0 


i 3.113) 


and  thac 


d", ( - ' 


=  n 


-  3.119) 
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likewise 


Thus 


77 

y 


z 

o 


P77z(T) 

-pyo(t). 


y(t) 


yo(t)  +  7ry(T) 

e(A-BP)t-o(0)  +  pe(IHCP)xirz(o) 


z(t) 


5o(t)  +  nz(r) 

.Pe(A-BP)t,  ( 

'  o  z 


y(t) 

_z(t) 

L 

(A-BP)t 


-Pe 


(A-BP)C 


Pe 

e 


(D+CP)  t 
(D+CP)r 


yo(°) 

77  (0) 
z 


J 


The  Tnatching  conditions  (3. 75 )- ( 3 - 80)  thus  reduce  to 


yQ  =  y(0)  +  rry(0) 

-  z (0)  +  ~z(0)  . 


However,  from  (3.120) 


yo  -  7(0)  +  P~z(0) 
z  =  -  Py(0)  +  Tfz(0) 

O 


fv  1 

r  I 

Pi 

|*y(0)  1 
1 

■  o 

z 

o 

-p 

I 

t 

1 

N 

O 

i 

Thus  , 


—  - 

r  - 1  r  i 

y  (0) 

I  Pi  1  V 

- 

i  ■ 0 

X  (0) 

-P  I  •  j  z 

-  z 

*  -  Loo 

(3.120) 


(3.121) 


(3.122) 


(3.122) 


(3.124) 


(3.125) 


(3.126) 


(3. 


) 


(3.125) 
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which,  together  with  (3.123),  uniquely  defines  the  asymptotic  series  solution 
to  (3.3)  in  terms  of  fundamental  matrix  (3.52). 

In  conclusion,  given  the  initial  value  problem  (3.3)  that 
satisfies  (2.32),  the  presence  of  the  singular  perturbation  parameter  u 
suggests  seeking  a  series  solution  in  two  time  scales.  Vasil'eva's  method 
of  matched  asymptotic  expansions  is  used  for  decomposing  the  solution  vector 
of  (3.3)  into  fast  and  slow  series  components  that  are  individually  easier 
to  solve  than  the  original  higher  order  system.  The  attractive  feature 
of  series  solutions  is  that  no  state  transformation  is  necessary.  The  series 
terms  for  the  actual  states  are  computed  as  they  are  needed.  In  the  eigen- 
space  iterations,  this  is  not  true.  Transformed  fast  and  slow  states  are 
found,  solved,  and  the  actual  states  attained  through' an  inverse 
transformation.  From  a  computational  point  of  view  this  method  has  to  be 
preferred  since  only  two  reduced  order  systems  of  differential  equations  are 
solved  in  attaining  the  high  order  solution.  In  the  series  method,  two 
reduced  order  systems  of  differential  equations  are  solved  for  every  term 
in  the  series.  Thus,  it  is  practical  only  when  the  number  of  terms  in  the 
series  required  is  small.  In  this  section  we  have  shown  that  the  separation 
of  time  scales  attained  in  asymptotic  series  solution  is  equivalent  to  the 
dominant  eigenspace  decompositions  of  Chapter  2,  in  that  the  convergence  of 
both  methods  is  dependent  on  the  existence  of  a  dichotomic  solution  to  the 
Riccati  equations  (2.19)  and  (2.26).  Thus,  for  a  given  state,  the  "fast' 
and  "slow"  components  of  that  state  obtained  using  either  decomposition 
algorithm  will  be  the  same. 
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3.5.  Simplified  Iterative  Schemes 

One  of  the  computational  drawbacks  of  the  dominant  left  and  right 
eigenspace  iterations  is  the  computation  of  the  inverses 


(D  +  PkB)_1 

(3.129) 

(D-  CPk)_1 

(3.130) 

at  every 

iteration. 

Looking  at  the  left  eigenspace  case,  the 

iterative  matrix  recursion 

is 

pk+i  ■  <»+vrl(c+V) 

(3.131) 

which  can  be  expressed  as 

DPk+l  =  -PkBPk+l  +  C+PkA‘ 

(3.132) 

If  this 

is  approximated  by 

DVi  ■  -vvc+ v 

Vi  ■  »'1(c,v-vV' 

(3.133) 

Then  we  will  have  eliminated  the  need  for  the  inverse  in  (3.131)  at  ever*/ 
iteration. 

Likewise,  looking  at  the  right  eigenspace  case,  the  iterative 
matrix  recursion  is 

Pk+1  =  (B  +  APk)(D  +  CPk)_1  (3.139) 

which  can  be  expressed  as 

?,  ,0  =  ^CP,  +  B  + AP,  . 
k— 1  k+1  x  x 


(3.135) 


*  1  ■,  "I  »l— >— -T-IMI  *  l 
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If  this  is  approximated  by 


=  -\C\+E+^ 

A  A  ^ 

Vi  ■  <-vVB+Afk)D' 


(3.136) 


we  will  have  again  eliminated  the  need  for  taking  the  inverse  in  (3.134)  at 
every  iteration.  [10]  has  used  a  contraction  mapping  argument  to  develop 
conditions  under  which  (3.133)  will  converge  to  the  dichotomic  Riccati 
solution.  This  methodology  can  easily  be  extended  to  (3.136).  When  the 
conditions  of  [10]  are  satisfied,  (3.133)  and  (3.136)  are  computationally 
superior  to  (3.131)  and  (3.134).  Unfortunately,  the  conditions  of  [4]  are 
somewhat  conservative  and  are  not  satisfied  by  many  systems  which  we  know 
can  be  decomposed  using  (3.131)  and  (3.134). 


3.6.  Partial  and  Full  Pole  Placement 

There  are  many  applications  using  the  techniques  developed  in 
Chapters  2  and  3.  They  include  robust  designs,  reduced  order  regulator 
designs,  and  reduced  order  modeling  to  only  mention  a  few.  In  this  section 
we  will  show  how  the  time  scale  decomposition  techniques  can  be  used  to  imple¬ 
ment  partial  or  full  pole  placement  design. 

We  will  now  be  considering  the  completely  state  controllable 

system 


y 

< 

1 _ 

II 

3" 

r- 

V 

4- 

r  *r 

G 

Z 

1  r 

D 

z 

»  J 

H 
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If  the  open  loop  eigenvalues  satisfy  (2.32),  then  we  can  apply 
transformation  (2.17)  which  transforms  (3.137)  into 


A-BP  B 


B  y  G 

+ 

D+PB  n  h+pg 


(3.138) 


where  P  is  obtained  using  either  (3.131)  or  (3.133).  The  transformation 
involved  here  can  be  written  as 


I  0  I  y 


I  s 


(3.139) 


which  possesses  the  explicit  inverse 


o  lly 


-p  I  n 


(3.140) 


Observe  now  that  the  pair  (D+PB,  H+PG)  spans  only  the  "fast"  controllable 
subspace.  Let  D  =D  +  PB  and  H  »H  +  PG  and  design  a  feedback  gain  F  such  that 


(D*  +  H*F) 


has  ra  desired  poles. 


The  control  is  of  the  form 


u  *  pi 


(3.141) 


=  F  ( Pv  +  z  ) 

=  [FP  :  F]|  ;  j 


(3.142) 


and  the  resulting  closed-loop  system  has  n  eigenvalues  according  to 


cr(A-BP) 


(3.143) 


and  m  eigenvalues  according  to 


'  (D*  +  H*F) . 


(3.144) 


I 


Now  apply  transfo-mation  (2.24)  to  (3.137).  This  gives 


C  A-PC  0  C  G-PH 

*  +  u 

z  C  D+CP  z  H 

where  P  is  obtained  using  either  (3.134)  or  (3.136).  The  transformation 
involved  here  can  be  written  as 

'si  "i  -P~|  Ty  * 

*  ( 

z  0  I  z 


(3.145) 


which  has  the  explicit  inverse 


P  S 
I  z 


(3.146) 


Observe  now  that  the  pair  (A-PC,  G-PH)  spans  only  the  "slow"  controllable 
subspace.  Let  A* *  A-PC  and  G* =  G-PH  and  design  a  feedback  F  such  that 


(A*  +  G*F) 


has  n  desired  poles. 

The  control  is  of  the  form 

u  =  FS 

*  F(y-Pz) 

-  IF  :  -FFl  [l] 

and  the  resulting  closed  loop  system  has  n  eigenvalues  of 


a(A*  +  G*F) 


and  m  eigenvalues  of 


(3.147) 


(3.148) 


o(D+  CP) 
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In  general,  both  slow  and  fast  modes  may  be  designed  for.  In  this  case  a 
general  two-state  design  procedure  may  be  implemented.  Assume  we  have  block 
diagonalized  (3.137)  as  done  in  Section  2.3. 


and  thus  can  be  solved  algebraically  [32].  If 


m±nja(D*)|  >  max| o(A*  +  G*!^)  |  .  (3.154) 

Then  the  iterative  scheme  (2.80)  may  be  used  to  solve  (3.152).  With  this  S,  the 
pair  (D*,  H*+SG*)  now  spans  only  the  "fas tf' controllable  subspace,  and  we  can 
design  a  feedback  gain  such  that 

a(D*+  (H*+SG*)Ff) 

has  m  desired  eigenvalues. 

Thus,  the  composite  control  is 

u  *  F  x  +  Frv 
s  f 

*  G  x  +  Fc(w+  Sx) 

s  f 

*  (F  +  F,S)x  +  F.w 

s  f  f 

-  ((Fs  +  FfS):Ff][*j  (3.155) 

and  using  either  transformation  (2.72)  or  (2.83),  (3.155)  can  be  expressed  in 

terms  of  our  original  state  variables.  This  control  places  n  eigenvalues  of 


a(A*  +  G*F  ) 
s 

and  m  eigenvalues  of 

o(D*  +  (H*  +  SG*)Ff ). 


(3.156) 


(3.157) 


This  technique  has  been  applied  to  singularly  perturbed  systems  [42]  where  it 
is  shown  to  be  a  generalization  to  results  obtained  in  [7,26,40].  This 
technique  is  also  applicable  to  discrete  time  models  as  shown  in  [43].  In  this 
case,  the  dominant  eigenvalues  are  part  of  the  "slow"  spectrum. 
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CHAPTER  4 

SINGULAR  PERTURBATION  MODELING  OF  MARKOV  CHAINS 


K 


n 

K 


4.1.  Introduction 

The  previous  two  chapters  have  unified  and  extended  many  results  on 
the  control  and  analysis  of  deterministic  two -time- scale  systems.  For  stochas¬ 
tic  systems  modeled  as  Markov  chains  we  can  extend  the  theory  of  time  scale 
decomposition  to  these  probabilistic  models  where  now  "slow"  and  "fast"  eigen- 
modes  correspond  to  "weak"  and  "strong"  transition  probabilities. 

Markov  chain  models  are  well  known  in  the  analysis  and  control  of 
stochastic  systems  [14,15,21,44].  Until  recently,  however,  this  analysis  was 
limited  to  simple  systems  due  to  the  tremendous  dimensionality  requirements 
of  most  Markov  models.  Recent  applications,  such  as  management  of  hydrodams 
[22,25]  and  queueing  network  models  of  computer  systems  [21,45,46],  have 
accentuated  the  need  for  reduced  order  approximations  of  large  scale  Markov 
chains.  In  this  regard  particularly  promising  is  a  perturbational  decomposi¬ 
tion-aggregation  method  of  Pervozvanski,  Smirnov  and  Gaitsgori  [23,24,47,48], 
and  Delebecque  and  Quadrat  [25,49].  The  method  assumes  that  the  groups  of 
strongly  interacting  states  are  known  and  treats  the  weak  interactions  between 
these  groups  as  perturbations.  The  result  is  a  short-term  decomposition.  Over 
a  longer  period  the  weak  interactions  become  significant,  while  each  group  of  strong¬ 
ly  coupled  states  can  be  replaced  by  an  aggregate  state.  A  long-term  aggre¬ 
gate  model  is  thus  obtained.  It  is  the  purpose  of  this  chapter  to  snow  how 
such  weakly  coupled  Markov  processes  can  be  modeled  as  a  ^''ngularly  perturbed 
system.  This  enables  us  to  apply  the  decomposition  techniques  of  chpaters  2 


and  3 . 


In  Section  4.2  we  introduce  the  two-time-scale  Markov  chain.  Slow 
and  fast  components  of  the  chain  are  identified  and  used  to  construct  the 
singularly  perturbed  model. 

In  Section  4.3,  a  grouping  algorithm  [34]  to  used  to  show  how  the 
states  of  an  arbitrary  Markov  chain  can  be  ordered  so  as  to  exhibit  the  two- 
time-scale  property. 

Finally,  in  Section  4.4,  a  two-time-scale  asymptotic  expansion  of 
the  steady  state  probability  distribution  is  constructed  using  the  singularly 
perturbed  model.  An  example  is  then  given  to  show  how  the  series  solutions 
can  be  used  to  efficiently  calculate  the  invariant  probability  measure  of  a 
large  queueing  network. 

4.2.  Singular  Perturbation  Modeling 

In  this  section  we  introduce  the  two-time-scale  Markov  chain  and 
show  how  it  can  be  put  into  standard  singularly  perturbed  form. 

Consider  the  n-state  Markov  chain 

3  P(T)(A  +  cB)  (4.1) 

where  p(T)  is  the  n-dimensional  row  vector  of  probabilities  p^(t)  to  be  in 
state  i  at  time  t.  Hence, 

n 

itiPi<T)  *  1  Ti0*  (4.2) 

A  and  B  are  both  n-dimensional  Markov  generators  and  A  has  the  form 


1 


(4.3) 


where  ,  j»l,...,N  are  dimensional  Markov  generacors. 

Thus,  in  (4.1)  N  groups  of  strongly  interacting  states  have  been 

N 

identified.  Group  j  consists  of  n^  states  and  j^Qj  aQ*  The  weak  interactions 
between  states  in  different  groups  are  modeled  as  multiples  of  a  small  posi¬ 
tive  scalar  e.  We  assume  throughout  the  thesis  that  for  0<e*e*  the  process 
(4.1)  has  a  single  ergodic  class  with  unique  stationary  probability  distribu¬ 
tion  p  defined  by 


0  »  p(A  +  e  B) 


(4.4) 


Furthermore,  let  each  of  the  N  generators  A^  define  a  Markov  process  with  a 
single  ergodic  class.  This  implies  that  each  A^  has  one  zero  eigenvalue. 

The  corresponding  right  eigenvector  t^  is  the  n^ -dimensional  column  made  of 
ones.  The  left  eigenvector  is  the  n^ -dimensional  row  of  stationary  prob¬ 
abilities  for  the  states  in  the  j-th  group  when  e=*0  in  (4.4).  The  matrix  form 

of  A.t.  *0,  v.A  *0,  and  v .  t .  *  1,  j  *  1, . . .  ,N,  is 
J  J  J  J  J  J 


AT  *  0,  VA»0,  VT  *  I 


(4.5) 


0  ... 

0  “ 

> 

0 

0 

...  0  . 

1 

0  ... 

0 

0 

V, 

0 

1 

...  0 

0  0 


!  ,  V 


0  .  . .  o  vN 


(4.6) 
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where  1^  is  the  NxN  identity,  the  j-th  row  of  T  is  made  of  n^ -dimensional 
columns  and  the  j-th  column  of  V  is  made  of  n. -dimensional  rows,  that  is  T 
is  nxN  and  7  is  Nxn.  The  influence  of  weak  interactions  «B  in  (4.1)  will 
become  significant  after  a  long  period  of  time  T.  Hence  T-scale  is  called 
"fast  time."  To  see  the  influence  of  C  "sooner"  we  introduce  the  "slow  time" 
t*er.  If,  for  example,  T  is  in  weeks  and  6  ,  then  t  is  in  years.  In 

t-scale  model  (4.1)  becomes 


p(t)  -  p(t)(^  +  B) 


(4.7) 


where  the  dot  denotes  Initially  p(t)  will  rapidly  approach  the  null-space 

T 

of  A  as  if  the  N  groups  were  separated  from  each  other.  After  that,  pB  is  no 
longer  negligible  with  respect  to  p  — .  This  behavior  is  a  characteristic  of 
singularly  perturbed  systems  [3,13,35],  As  in  [35]  our  goal  is  to  transform 
(4.7)  into  a  standard  singularly  perturbed  form  which  makes  the  slow  and  fast 
parts  of  p(t)  more  explicit.  For  N  slow  variables  we  take  the  elements  T|^.  of 
the  row 

"  *  pT  (4.8) 

because  is  the  probability  of  the  process  (4.7)  to  be  in  group j .  Since  the 
j 

transitions  between  the  groups  are  slow,  T|  will  change  slowly.  After  the 

fast  transient  is  over,  probability  p^  is  approximated  by  ”jvj»  where  v^  is  the 

stationary  probability  for  the  process  to  be  in  state  i,  once  it  is  in  group  5. 

Thus  the  difference  Y  .  =  p .  -7.  .  v1  is  the  fast  part  of  p.  .  Of  n  such  differences, 

L  i  j  J  ’ 

n-N  are  independent  and  are  defined  by 


YW  =  p  -  ~V,  WT  =  0. 


(4.9) 
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W  is  an  (n-N)xn  block-diagqnal  matrix  of  the  form 


(4.10) 


where  each  is  an  (T,^-l)xT|^  matrix  of  the  form 


1  0 


0  1 


0  0 


0  0-1 


0  0-1 


10-1 


0  1-1 


(4.11) 


This  defines  the  transformation 


P  -  [ 


•„Y]  [V 

W 


(4.12) 


to  perform  the  inverse  transformation,  we  define  the  n  x (n-N)  block  diagonal 
matrix  S  such  that 


WS  *  I  VS  *  0. 
n-N’ 


(4.13) 


gives  the  explicit  inverse  of  (4.12)  as 


n.Y!  =  p  £  r ,  s  ] 


(4.14) 


:he  form 
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N 


(4.15) 


and  each  is  an  n^  x(n^  -  1)  matrix  of  the  form 


i-v1  -v2 

j  j  j 


s. 

J 


1  1  2 

-v.  l-.j 


n  j-1 
■v  J 
J 


1-v 


nj-1 


L 

l_”Vj 


-v 


j 


j  J 


(4.16) 


using  (4.14)  and  (4.12),  (4.7)  can  now  be  transformed  into  a  standard  singular 
perturbation  form 


"iVBT  +  YWBT 


eY  =>  SmVBS  +  YW (A  +  eB)S 


(4.17) 


whose  properties  are  well  known  [3,12].  The  crucial  stability  condition  on 
W(A  +  eB)S  is  satisfied  by  the  fact  that  the  j-th  block  of  the  block  diagonal 
matrix  WAS  is  deflated  for  the  zero  eigenvalue.  Thus,  from  the  basic 
assumptions  on  (4.1),  Re  ^V(WAS)<0,  i«l,...,n-N.  Assuming  that  sY  ~  0  and 
subs  tituting 

Y  *  -eTJVBSfWAS)-1  (4.18) 
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from  (4.17)  an  ^-corrected  slow,  node 1  Is  obtained 

f,  -  1](VBT  -  «VBS  (WAS )  ” HfBT )  '(4.19) 

which  for  c«Q  reduces  to  the  aggregate  model  obtained  In  [23].  Note  that  the 
Inverse  in  the  correction  term  is  "decentralized,"  and  consists  of  N  inverses 
of  smaller  diagonal  blocks.  From  (4.18)  we  see  that  the  slow  part  of  Y  is  only 
0(c).  If  we  express  (4.17)  in  the  fast  time  scale  T-t/c,  or,  equivalently, 
if  we  apply  transformation  (4.12)  to  (4.1),  we  would  obtain 

ff-0(e),  g-YWAS  +  0(c).  (4.20) 

Thus,  in  the  fast  time  scale  as  c-0,  the  slow  variable  •]  tends  to  a  constant 
and  the  system  matrix  for  fast  variable  Y  is  WAS.  In  this  manner  two-time 
scale  asymptotic  expansions  for  '.j,  Y  will  be  constructed  in  section  4.4.  Note 
that  in  the  fast  time  T,  to  0(c),  the  interactions  between  groups  are  neglected. 
In  the  slow  time  t  this  is  not  true.  This  points  out  the  necessity  of  posing 
the  problem  in  the  slow  time  scale  if  we  are  to  use  perturbation  methods  in 
the  analysis  and  control  of  Markov  chains  on  the  infinite  horizon. 

Let  us  now  consider  the  discrete  time  model 

P(k+1)  -  p(k)P  >  p(k)(j  +  I  +  B)  (4.21) 

where  P  is  the  probability  transition  matrix  and  A  and  3  are  generators.  As 
in  (4.7),  the  strong  interactions  appear  as  multiples  of  j  ,  that  is}  (4.21) 
is  expressed  in  the  slow  time  scale.  The  transformation  (4.12)  results  in 


T](k+1)  »  *,(k)  (I  +  VBT)  +  Y  (k)WBT 
Y (k+1)  -  ~(k) VBS  +  Y(k)W(^  +  I  +  B)S. 


(4.22) 


Properties  of  this  type  of  discrete  time  models  are  discussed  in  [43,50,51]. 

•  • 

Note  that  the  same  model  is  obtained  from  (4.17)  if  m,  Y  are  replaced 

M(k+1)  -  T](k),  Y(k+1)  -  Y(k), 

that  is  for  the  step  size  1.  The  slow  model  analogous  to  (4.19)  is  obtained  by 
neglecting  e[Y(k+l)  -Y(k)]. 

Throughout  the  remainder  of  this  thesis  singularly  perturbed  models 
(4.17)  and  (4.22)  will  be  used  extensively.  It  is  important  to  realize  that 
by  just  finding  the  steady  state  probabilities  v. ,  j-l,...,N,  we  can  construct 
the  transformations  (4.12)  and  (4.14).  Thus,  these  singularly  perturbed  models 
are  obtained  with  little  computational  burden  once  the  structure  of  (4.1)  is 
identified.  The  problem  of  permuting  the  states  of  an  arbitrary  generator  to 
exhibit  this  form  is  the  topic  of  the  next  section. 

4.3.  Ordering  of  States 

The  preceding  section  assumes,  as  do  the  earlier  references  [23-25, 
47-49] ,  that  the  N  groups  of  strongly  interacting  states  are  known  and  the 
generator  of  the  process  (4.1)  is  of  the  form  GaA  +  eB,  where  A  is  block-diagonal 
and  cB  is  small.  This  situation,  convenient  for  an  asymptotic  analysis,  is 
seldom  met  in  reality.  The  ability  to  identify  groups  of  strongly  coupled 
states  given  an  arbitrary  generator  is  an  important  modeling  task.  In  this 
section  we  address  this  task.  Our  main  tool  is  a  state  "grouping"  algorithm, 
developed  in  [34]  for  power  system  matrices  which  we  apply  here  to  Markov 
generators . 


Consider  the  generator 
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(4.23) 


This  generator  describes  the  transitions  between  the  states  in  a  queueing  network 
of  a  computer  system  [21,46]  consisting  of  a  filing  device  D,  a  secondary 
memory  M,  and  a  processor  C.  Assuming  that  there  are  three  users,  the  states 
x^,...,x^q  are  defined  inTable4.1  whose  entries  are  the  numbers  of  jobs  inD,C, 
and  M  queues. 
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Table  4.1.  States  of  Queueing  Network  Model 

The  main  difficulty  in  determining  whether  a  state  interacts  weakly 
with  a  group  of  states  is  that  its  interactions  with  each  state  in  the 


group 
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can  be  small,  buC  the  sum  of  these  interactions  can  be  significant  enough  to 
be  considered  strong.  In  other  words,  in  practice  e  is  not  infinitesimal  and, 
if  e*0.1  is  considered  as  weak,  6c  Is  strong.  Thus,  already  for  (4.23),  and 
certainly  for  more  complex  forms  of  G,  a  systematic  procedure  is  required  to 
determine  the  strongly  interacting  states.  Such  a  procedure  is  Avramovic's 
grouping  algorithm. 

The  grouping  algorithm  is  based  on  the  following  property  of  a  process 
with  as  yet  unknown  groups  of  strongly  interacting  states:  If  there  are  N 
such  groups,  then  matrix  G  will  have  N-l  small  eigenvalues  which  are  clustered 
near  its  zero  eigenvalue.  Let  the  columns  of  an  nxN  matrix  M  be  the  right 
eigenvector  of  G  for  the  N  smallest  eigenvalues,  including  ^*0.  Each  row  of 
M  corresponds  to  one  of  the  n  states.  We  observe  that  T  in  (4.6)  is  the  limit¬ 
ing  form  of  M  when  interactions  are  neglected  and  the  states  are  grouped. 

Note  that  states  in  the  same  group  have  identical  rows  in  T  while  states  not 
in  the  same  group  have  rows  in  T  that  are  perpendicular  to  one  another.  By 
continuity  we  expect  that  the  corresponding  rows  in  M  should  be  "nearly  identical" 
and  hence  close  to  being  linearly  dependent.  Instead  of  investigating  "nearly 
identical"  rows  of  M,  Avramovic's  algorithm  does  the  opposite:  it  starts  by 
determining  N  rows  of  M  which  are  as  linearly  independent  as  possible.  In  the 
algorithm,  these  rows  are  found  by  a  simple  Gaussian  elimination  with  full 
pivoting.  The  corresponding  N  states  are  called  the  reference  states  around 
which  the  remaining  n-N  states  should  be  grouped.  When  the  N  reference  rows  of 
M  are  found,  a  permutation  rr  is  performed  so  that  these  rows  appear  as  the 
first  N  rows.  Thus  the  NxN  matrix  in 


i 


is  nonsingular  and  a  new  basis  of  Che  same  eigenspace  is 


(4.25) 


In  [28]  important  properties  of  matrix  L  are  deduced  from  the  fact  that  it  is 
the  "dichotomic”  solution  of  the  Riccati  equation  (2.26).  A  property  to  be 
used  here  is  that  the  sum  of  entries  in  each  row  of  L  is  1.  Thus,  if  M  has 
"nearly  identical"  rows,  each  row  of  L  will  have  an  entry  close  to  1,  and  all 
other  entries  close  to  0.  The  criterion  for  grouping  is  simple.  A  row  of  L 
belongs  to  the  group  defined  by  that  reference  row  which  has  entry  1  in  the 
same  column  in  which  the  examined  row  of  L  has  its  largest  entry. 

We  now  proceed  to  apply  this  algorithm  to  determine  four  groups  of 
strongly  interacting  states  in  (4.23).  The  four  smallest  eigenvalues  of  G  are 
0,  -0.025,  -0.065,  -0.107.  The  eigenvector  matrix  M  and  the  matrix  (4.25)  are 
as  follows  : 


Note  that  the  rows  are  labeled  with  the  index  of  the  state.  An  excellent 
grouping  is  achieved,  because  each  row  of  L  has  one  distinctly  large  entry. 
Therefore  the  groups  are  i4,7,9,10},  13,6,8},  (2,5),  llj.  The  permutation  of 
the  generator  (4.23)  to  this  ordering  of  the  states  is 
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6  . 36 
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© 


'5 


where  Che  weak  coupling  is  apparent.  The  physical  interpretation  of  Che 


grouping  is  now  clear.  The  aggregate  T,^(t)  is  the  probability  that  at  time 
t  there  are  j-1  jobs  in  the  D-queue.  This  is  intuitively  clear  since  the  mean 
service  time  of  a  filing  device  D  is  typically  ouch  slower  than  that  of  second¬ 
ary  memory  M  or  processor  C.  The  Y(t)  variables  describe  fast  fluctuations 
between  the  C  and  M  while  the  D-queue  is  in  a  given  state.  The  accuracy  of  the 
approximation  using  the  aggregate  matrix 


VBT  - 
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.018 
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0 

.05 

-.05 

(4.28) 


can  be  judged  from  the  fact  that  its  eigenvalues  0,  -.027,  -.071,  -.118  are 
close  (less  than  107.  error)  to  the  corresponding  eigenvalues  of  G.  With  a 
correctec  model  (4.19)  they  are  within  2%. 


4.4.  Two -Time -Scale  Asymptotic  Expansion 

One  of  the  applications  of  the  singularly  perturbed  model  (4.17)  is 
the  ability  to  obtain  a  two-time-scale  asymptotic  expansion  of  the  solution 
vector  p(t).  Then  "slow"  and  "fast"  components  can  be  solved  independently 
rather  than  solving  the  high  order  system  of  "stiff"  differential  equations. 

It  is  the  purpose  of  this  section  to  construct  such  an  expansion.  As  time 
tends  to  infinity,  the  asymptotic  series  of  differential  equations  will  reduced 
to  algebraic  equations  which  can  be  used  to  solve  for  p  in  (4.4)  in  a  computa¬ 
tionally  attractive  manner.  The  ability  to  compute  p  efficiently  for  large 
chains  has  many  applications  [46]  . 


We  seek  a  solution  to  (4.17)  of  the  form  [8,13] 


‘1  -  1](t)  +  L-j(T)  (4.29) 

Y-f(t)+Ly(T)  (4.30) 


where  each  term  is  a  power  series  in  e  with  coefficients  depending  either  on  t, 
for  the  slow  ("outer")  series,  or  on  T  »  for  the  fast  ("inner")  series. 


U(t)  *  'lg(t)  +  cT]^(t)  +  ...  (4.31) 

U(T)  -  L°(T)  +  sli(T)  +  ...  (4.32) 

'i  4  '1 

Y(t)  -  £0(t)  +  e£1(t)  +  ...  (4.33) 

Ly(T)  -  I^(T)  +  eL.J-(T)  +  ...  (4.34) 


Substituting  (4.29)  through  (4.34)  into  (4.17)  and  equating  the  terms  with  like 
powers  in  «,  separately  for  t  and  T  series,  we  obtain,  for  zeroth  order  terms 


\(t)  =  nrt(t)VBT  7,  (0)  -  7,(0) 

OO  0 

(4.35) 

L?(T)  =*  0 
•1 

(4.36) 

/-\ 

ft 

V-/ 

u 

o 

(4.37) 

dL? (t) 

"d—  =  ly(T)WAS  L°(0)  -  Y(0) 

(4.38) 

We  see  that  within  0(e)  the  fast  part  of  .,(t)  and  the  slow  part  of  Y  (t)  are 
zero  for  alL  t.  Furthermore,  due  to  the  asymptotic  stability  of  (4.38)  the  fast 
term  L^(T)  -0  as  T=-|-J3.  For  small  e  thi  ;  "boundary  layer  term"  is  negligible 

A  A 

for  all  t>t,  where  t  is  of  order  -e*ne.  Thus,  for  t>t  p(t)  is  approximated 
within  0(e),  by  ~Q(t)V. 
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For  first  order  terms  we  obtain 


\(t)  -  \(t)VBT  +  XL(t)WBT 

(4.39) 

dT  *VT)WBT 

(4.40) 

?L(t)  -  il0(t)VBS(WAS)“1 

(4.41) 

dVT>  1  o 

— ^ —  -  Lj(T)WAS  +  L^(t)WBS 

(4.42) 

1a  1  .A 

The  matching  conditions  become  L-r(O)  *-m^(0)  and  L  (0)  »-(^(0). 

By  direct 

integration  of  (4.40)  we  obtain 

T 

iij(T)  -  lij(0)  +  jL°(cr)WBTdff. 

(4.43) 

Since  L^(oa)  =  0, 

•1 

00 

li(T)  =»  -J  L^(<7)WBTdc 

T  T 

(4.44) 

using  (4.38),  this  becomes 


dL^(c)  ^  . 

iJj  (T)  =■  -j'  —  (WAS ) "  WBT  do  ■  -fL^(a)  -  lJ(t)]  (WAS)‘^BT 


(4.45) 


ly  (  T  )  (WAS  ) 


thus  at  each  stage  only  separate  fast  and  slow  differential  equations  need  to  be 
solved.  An  important  property  is  that  the  fast  equations  are  "decentralized" 
groups  of  states  due  to  the  fact  that  WAS  is  block-diagonal.  From  (4.35) 
through  (4.45)  we  have 
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71  (t)  -  *Q(t)  +  e(^(t)  +  li(T))  +  0(?2) 
Y(t)  -  I^(T)  +  etf^t)  +  lJ(t))  +  0(e2) 


(4.46) 


higher  order  terms  are  obtained  similarly.  The  computational  advantages  of  the 
eigenspace  iterations  over  the  series  solutions  have  been  discussed  in  Chapter 
3.  However,  for  an  intuitive  understanding  of  how  the  solution  vector  decomposes, 
the  series  solutions  are  more  attractive.  As  t— ®,  the  L-terms  vanish  and  the 
equilibrium  or  steady  state  distribution  p  is  given  as 


P  *  CHq  +  «7)^  +. . .  )V  +  (eY^  +  e2Y2  +. . .  )W  (4.47) 

—  A  —  * 

where  T,^»~^(t)  and  Y^»Y^(t)  as  t-*08*  The  terms  in  (4.47)  are  easily  computed 
from  the  following  sequence  of  algebraic  equations. 


T.qVBT  =  0  .2^0  =*  1 

(4.48) 

Yi  =  s0VBS(WAS)"L 

(4.49) 

N 

~tlVBT  +  TjWBT  -  0  =  0 

(4.50) 

Yk  *  -  (Y k_  lWBS  +  lVBS  )  (WAS ) ' 1 

(4.51) 

N  . 

7,kVBT  +  YkWBT  =  0  =  0 

(4.52) 

Note  that  VBT  is  an  NxN  matrix  and  W^A^Sj  Is  (“j _^)x(~j _^)  for  j*l,...,N. 

Thus  we  need  only  solve  N  and  ~  -1, i»l, . . . ,N  dimensional  systems  of  linear 
equations  to  obtain  a  good  approximation  to  p.  If  we  were  to  solve  (4.4) 
directly  not  only  can  the  dimension  of  A  + eB  be  large,  but  also  the  presence  of 
could  result  in  an  ill-conditioned  problem.  Note  that  (4. 48)  -  (4.52 )  are 
independent  of  e. 


Queueing  network  models  of  computer  systems  are  often  in  the  form  of 

a  large  Markov  generator  [21,45,46].  Performance  analysis  measures  are  usually 

functions  of  p.  By  using  (4.48) -(4.52)  these  performance  measures  can  be 

efficiently  computed  on  the  subsystem  and  aggregate  level.  We  illustrate  this 

concept  on  the  model  used  in  [46] .  In  this  paper  various  iterative  techniques 

for  computing  p  are  compared  from  a  computational  point  of  view.  Direct  methods 

of  solution  are  not  considered  due  to  the  large  storage  requirements.  However, 

it  is  acknowledged  that  the  results  obtained  using  direct  methods  are  more 

accurate  and  require  substantially  less  total  time  than  iterative  techniques. 

The  singular  perturbation  method  we  have  proposed  is  a  hybrid  of  direct  and 

iterative  methods.  Each  series  term  of  p  is  solved  directly  on  a  significantly 

"reduced  order  basis.  However,  to  i  jve  the  approximation  successive  series 

terms  must  be  solved  (which  can  be  viewed  as  an  iterative  process  with  conver- 
k 

gense  rate  €  ).  We  will  now  compute  p  using  (4.48)-(4.52)  for  the  model  in  [46]. 
This  queueing  model  represents  the  architecture  of  a  time-shared  multiprogrammed 
paged  virtual  memory  computer  system.  Assuming  three  jobs  in  the  system,  the 
Markov  generator  is  of  20th  order.  Using  the  asymptotic  series  solutions  (4.47)- 
(4.52),  P0*”!qV  and  p^  *  1|qV  +  e  (T,^V  4-  Y  ^W)  were  computed  and  compared  to  the  actual 
steady  state  distribution  p  as  follows: 


0.0002493” 

0. 0002483- 1 

~0. 0002493“ 

0.0004452 

0.0004455 

0.0004452 

0.0007962 

0.0007992 

0.0007961 

0.0014284 

0.0014338 

0.0014282 

0.0002509 

0.0002502 

0.0002508 

0.0004483 

0.0004489 

;  0.0004483 

0.0008027 

0.0008052 

|  0.0008028 

0.0002530 

0.0002502 

|  0.0002530 

0.0004527 

0.0004489 

|  0.0004527 

0.0002530 

p0  3 

0.0002502 

P1  = 

0.0002530 

0.0082679 

0.0082697 

|  0.0082677 

0.0080455 

0.0080753 

'  0.0080451 

0.0078604 

0.0078856 

0.0078601 

• 

0.0082979 

0.0082697 

:  0.0082984 

: 

0.0080917 

0.0080753 

i  0.0080921 

0.0082914 

1 

0.0082697  ! 

0.1049442  ; 

0.0082919  | 

0.1049480 

0.1049481  ; 

I 

0.0363037 

0. 0362582 

:  0.0363041  1 
' 

0.1048605 

0.1049442 

1  0.1048591  ' 

0.6996535 

L  -J 

0.6996279 

L  -4 

0. 699654  lj 

Note  the  accuracy  using  just  one  or  two  terms  in  the  series. 

In  the  remaining  chapters  the  concepts  developed  in  this  chapter 
are  applied  to  controlled  Markov  chain  problems.  Efficient  two -time- scale 


design  algorithms  result  in  a  near  optimal  control  policy. 
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CHAPTER  5 

THE  DISCOUNTED  COST  PROBLEM  FOR  CONTROLLED  MARKOV  CHAINS 


5.1.  Introduction 

In  this  chapter  we  consider  the  controlled  Markov  chain 


p(t)  -  p(t)<^- +  B(u)).  (5.1) 

When  the  process  is  in  a  given  state  x,  the  control  policy  u(x)  determines  the 
transition  rates  of  the  chain. 

We  assume  that  a  . (u(i))  and  b_(u(i))  are  continuous  on  the  compact 
set  for  all  i,j  n.  Thus,  a  policy  u6Rn  is  of  the  form 


’u(l) 

u(n) 

W 


and  can  take  any  value  in  the  Cartesian  product  x  x  . .  •  x  Un«  A  control 
policy  u  is  usually  chosen  to  meet  some  performance  specification.  In  this 
chapter  we  seek  a  control  policy  that  minimizes  the  infinite  horizon  discounted 
cost  [14,15] 


J  *  E  /  eaiJf(x  ,u(x  ))dc  (5.2) 

-r  O  O 

X  0 

where  a  is  the  discount  rate  and  f(i,u(i))  is  the  instantaneous  cost  of 
being  in  state  i  using  control  u(i).  It  is  assumed  that  f(i,u(i))  is 
continuous  on  lT  ,  for  ial,...,n. 

In  general,  there  are  no  closed  form  solutions  to  this  problem. 

Thus,  iterative  techniques  must  be  used  to  obtain  the  optimal  policy.  These 
methods  may  be  classified  as  either  policy  iterations  [14]  or  value  iterations 
[15.44].  Policy  iterations  usually  require  more  computations  per  iterate. 


however,  they  often  converge  faster  than  value  iterations  [15] .  The  large 
dimension  of  the  generator  in  (5.1)  can  make  policy  iterations  impractical. 

Also,  the  order  of  magnitude  difference  between  — and  B(u)  can  result  in 
ill-conditioned  nonlinear  programming  problems  during  the  minimization  phase 
of  these  algorithms.  In  this  chapter,  we  apply  the  results  of  Chapter  4  to 
obtain  a  near  optimal  policy  with  less  computational  problems. 

In  Section  5.2  we  motivate  our  approach  to  the  problem  by 
decomposing  a  finite  time  discounted  cost  for  a  fixed  policy  into  aggregate 
and  fast  components.  This  enables  us  to  identify  an  "aggregate"  discounted 
cost  problem  that  we  will  use  to  approximate  the  high  order  problem  (5.2). 

In  Section  5.3  we  consider  the  solution  to  the  problem  (5.1)  and  (5.2) 

as  e-*-0.  We  then  establish  in  what  sense  the  resulting  policy  is  near  optimal. 

* 

Finally,  in  Section  5.4,  a  decentralized  algorithm  is  presented  for 
obtaining  a  near  optimal  control  in  a  computationally  attractive  manner. 


5.2.  Decomposition  of  the  Cost  for  a  Fixed  Policv 


When  the  policy  u(x)  is  fixed  and  time  is  finite. 

(5.2)  reduces  to 

t 

J(x  ,t)  ■  E  /  ea°f(x  )dc. 

0  SJ 

(5 

It  is 

X  0 

well  known  [14]  that  J  is  the  solution  of 

j  +  aJ  *  (-  +  B)J  +  f 
e 

(5 

where 

f  -  [f  (1)  ,f  (2) . f(n)jT 

(5 
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and  J(t)  is  an  n-dimensional  column  vector  whose  ith  entry  is  the  cost  incurred 
starting  In  state  i  at  time  t. 

As  In  (4.7),  (4.12),  (4.14)  we  transform  (5.5)  using 


and  obtain 


J  -  VJ, 

n 


j  -  WJ 
Y 


J  +  oJ  -  VBTJ  +  VBSJ  +  Vf 
n  n  n  Y 

ej  +  eaJ  -  eWBTJ  +W(A+eB)SJ  +  eWf. 

Y  y  n  Y 

Since  this  system  Is  In  standard  singular  perturbation  form,  we  can  apply 
Vasileva's  two-time  scale  expansion  procedure  [13], 


(5.6) 

(5.7) 

(5.8) 


J  (t)  -  J  (t)  +  £  (t) 
n  n  n 


j  (t)  -  J  (t)  +  £  (t) 
Y  Y  Y 


(5.9) 

(5.10) 


where  each  term  is  a  power  series  in  e  with  coefficients  depending  either  on 
t,  for  the  slow  ("outer")  series,  or  on  t«  — ,  for  the  fast  ("inner")  series. 


3  (t)  -  3°(t)  +  c3*(t)  + 

n  n  n 

£  (x)  ■  £°(t)  +  e£^(t)  + 

n  n  n 

Jy(t)  -  3°(t)  +  cjJ(t)  + 

£/t)  «£°(t)  +  efj(r)  + 


(5.11) 

(5.12) 

(5.13) 

(5.14) 


Substituting  (5.9)  through  (5.14)  into  (5.7),  (5.3)  and  equating  the  terns 
with  like  powers  in  e,  separately  for  t  and  ~  series,  we  obtain,  for  zeroth 
order  terns 


d 3° ( t ) 

+  *3°(t)  -  vBi3°(t)  *r  vf,  3°(0)  -  j  (x  ,o> 

1  n  ",  **  o 


dt 


(5.13) 


(5.18) 


— Jp— -  WAS£”(t),  JC”CO)  -  Jy(xo,0). 

We  see  Chat  within  0(e)  the  fast  part  of  and  the  slow  part  of  are  zero 
for  all  t.  Furthermore,  due  to  the  asymptotic  stability  of  (5.18)  the  fast 
term  jC°(t) -*■  0  as  t  For  small  e  this  "boundary  layer  term"  is  negligible 

for  all  t>  t,  where  t  is  of  order  -tine.  Thus,  for  t>  t  cost  J  is  approximated 
within  0(e)  by  the  "aggregate"  cost  J^(t)  defined  by  (5.15). 

For  first  order  terms  we  obtain 


dJ  (t)  n  _1  .1 

— +  aj\t)  -  VBTJA(t)  +  VBSJ^t) 
at  n  n  Y 

(5.19) 

djJ(x)  _  ,  , 

—5—  -  VBSjf3 (t)  ,  r(0)  -  -J*(0) 

dx  n  n 

(5.20) 

J1(t)  -  -(WAS)"1(WBTJ°(t) +Wf) 

Y  n 

(5.21) 

+  aX°(x)  -  WASJ^(t)  +  WBS£°(x),  j^(0)  -  -j*(0) .  (5.22) 

Observe  that  at  t"0,  x»0  the  first  order  terms  in  each  series  sum  to  zero. 

Also  observe  that  x ■*■<*>  all  £  terms  tend  to  zero.  Hence,  by  direct  integration 

(5.20)  yields  an  algebraic  expression  for  /(t)  in  terms  of  £°(x) , 

n  Y 

T1(t)  -  VBS(WAS)"1jC°(t)  (5.23) 

V  Y 

that  is,  at  each  stage,  only  separate  fast  and  slow  equations  need  to  be 
solved.  An  important  property  is  that  the  fast  equations  are  "decentralized" 
groups  of  states  due  to  the  fact  that  WAS  is  block-diagonal. 


dr 


From  (5.9),  (5.10),  (5.16),  and  (5.17)  we  have 


J  (t)  -  3°(t)  +  e(jj(t)+jcj(t))  +  0(e2)  (5.24) 

n  n  n  h 

Jy(t)  -  /(t)  +  e(J*(t)  +  /(t))  +  0(e2) .  (5.25) 

Higher  order  terms  can  be  determined  In  an  analogous  manner,  or 
using  techniques  [12,13]  which  are  computationally  more  efficient.  Recalling 
(4.5),  (4.12),  (4.14),  and  (5.6)  we  consolidate  (5.24)  and  (5.25)  into 

J  -  TJ  +  SJ  (5.26) 

n  Y 

which  represents  the  two-time  expansion  of  J  for  all  t.  For  t  large,  t-*-»,  the 
l-terms  vanish  aai  the  equilibrium  (infinite  horizon)  cost  expansion  is 

J  -  TJ°  +  e(TJ1+  SJ1)  +  e2(TJ2  +  SJ2)  +  0(E3)  (5.27) 

n  n  y  n  y 

where  J°-J°,  etc.  are  uniquely  defined  as  the  equilibrium 

solutions  of  (5.15),  (5.19),  (5.21),  etc.  We  see  that  aggregate  cost  TJ°  is 

n 

an  0(s)  approximation  of  the  full  cost  J.  This  fact  serves  as  motivation  for 
the  two-time-scale  algorithm  developed  in  Section  5.4. 


5.3.  Near  Optimal  Control 

The  infinite  horizon  discounted  cost  problem  (5.1)  and  (5.2)  is 
well  defined  and  can,  in  principle.be  solved  using  any  one  of  a  number  of 
methods  [15,44,53].  In  all  these  methods,  the  computational  burden  can  be 
prohibitive  for  large  chains.  However,  in  singularly  perturbed  models 
to  reduce  computations  and  improve  convergence  of  design  algorithms  [7],  a 
reduced  or  unperturbed  (e*0)  problem  is  solved.  The  resulting  policy 
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obtained  is  termed  "near  optimal.”  In  this  section  we  define  the  "reduced 
problem"  and  in  what  sense  the  resulting  policy  is  near  optimal. 

For  the  purpose  of  intuitively  understanding  the  near  optimality 
results,  let  us  first  consider  a  special  case  of  the  more  general  problem 
outlined  in  Section  5.1.  Assume  that  the  control  parameter  u  is  a  scalar  and 
that  u  can  take  on  any  value  in  a  segment  Uq.  The  Hamilton  Jacobi  equation 
for  this  problem  is 

0*  min  { (— u~  +  B°  (u) ) J  +  f (u) }  *  min{^(u,e)}  (5.28) 

uEU  6  u£U 

o  o 

where 

f(u)  -  [f(l,u(l)),...,f(n,u(n))]  (5.29) 

and  Ba(u)  -  B(u)-al.  We  now  approximate  the  full  optimal  control  problem 
(5.28),  (5.2)  for  e  >  0  by  a  simpler  problem  defined  at  e»0.  In  (5.20)  we 
cannot  let  e»0,  but  if  we  substitute  (5.27)  into  (5.28), 

0  -  min{A(u)SJ1  +  Ba(u)TJ' +  f(u) +  e[A(u)SJ2  +  B°t(u)TJ1  +  Ba(u)SJ1] 
u£U  Y  n  Y  Y  Y 

+  0(e2)}  (5.30) 


we  obtain  at  e  » 0  the  "reduced"  optimality  condition 


0  -  oin{A(u)SJ1  +  Ba(u)TJ°  +  f(u)}. 
u6U  Y  n 


(5.31) 


We  denote  a  control  minimizing  (5.31)  by  u  .  To  avoid  technicalities  we 

n 

assume  that  the  derivatives  A^,  B^,  and  f  of  A(u),  B3(u),  and  f(u)  are 

continuous  and  that  the  unique  minima  for  each  row  in  (5.31)  are  reached  in 

the  interior  of  U  ,  that  is  u  satisfies 
o  n 


A  (u  )SJX  +  B^u  )TJ°  +  f  (u  )  -  0. 
uny  unn  uri 


(5.32) 
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Note  that  with  u  given,  the  substitution  into  (5.31)  yields 


A(u  )SJX  +  Ba(u)TJ°  +  f(u  )  -  0 

n  y  n  n 


(5.33) 


which,  when  multiplied  by  V(u  )  and  W,  respectively,  results  in 


V(u  )Ba(u  )TJ°  +  V(u  )f(u  )  -  0 
n  n  n  n  n 

WA(u  )SJ1  +  WBa(u  )TJ°  +  WF(u  )  -  0. 

n  y  t)  n  n 


(5.34) 


(5.35) 


Conditions  (5.33),  (5.34),  and  (5.35)  uniquely  determine  u  ,  TJ  ,  and  SJX .  To 

n  n  y 

see  in  what  sense  u^  is  near-optimal,  we  represent  the  optimal  control  u*  for 
(5.28)  in  the  form 


u*  *  u°  +  eu^-  +  0(e2). 


(5.36) 


Then  we  substitute 


A(u*)  ■  A(u  )  +  sAu(u  )u  +  0(e  ) 


(5.37) 


and  similar  expressions  for  Ba(u*)  and  f(u*)  into  (5.33),  that  is  into 


0  -  A(u*)SJ1  +Ba(u*)TJ°*+f(u*)  +  e[A(u*)SJ2*+Ba(u*)(TJ1*+SJ1*)] 
Y  n  y  n  y 

+  0(e2) 


(5.38) 


and  obtain 


0  *  A(u°)SJ1  +  Ba(u°)TJ°*  +  f (u°)  +  e{  [A  (u°) SJ1*  +  B3 (u°)TJ°* +  f  (u°)]u1 
Y  T)  uyuriu 


+  A(u°)SJ2%Ba(u°)(TJ1*  +  SJ1*)}  +0(e2)  =  u(e) 
YU  n  Y 


(5.39) 


;vTe  see  from  (5.31),  (5.33),  and  (5.39)  at  e  ■  0  that  u°  =  u  and,  hence, 
o  ^ 

and  i  are  determined  by  u  ■  u  .  Next  we  note  that  the  £-term  in  (5.39)  is 

h  n 

zero  for  all  s>0  because  u*  is  optimal  for  all  t>0.  This  term  involves  the 

unknown  first  order  optimal  control  term  u^.  However,  by  (5.32)  the  expression 

1  * 
multiplying  u  is  zero.  Thus  the  cost  terms  SJ"  and  TJ^  are  uniquely 


determined  by  u°.  This  establishes  that  when  u  is  found,  the  series  (5.27) 

2 

for  the  optimal  cost  is  matched  up  to  0(e  ).  Thus  u^  is  near  optimal  in  the 
sense  that 

J(u*)  -  J(u  )  -  0(e2).  (5.40) 

This  property  that  each  term  in  the  optimal  control  series  matches  two  terms  in 

the  optimal  cost  series  is  found  in  other  control  problems  [58]. 

For  the  more  general  problem  posed  in  Section  5.1,  it  can  be  shown 

that  when  u  is  unique,  (5.40)  holds  without  the  differentiability  require- 
n 

ments  on  A(u) ,  Ba(u),  and  f(u)  needed  in  (5.37).  This  has  been  shown  in  [25] 
based  on  the  theory  of  extremum  in  quasi-differentiable  functions  [60].  The 
basic  result  of  [60]  used  in  [25]  is  that 


3u(e) 

3e+  ! e»0 


3l( u  ,e) 
3e+ 


e-0 


A(u°)SJ2  +  Ba(u°)(TjJ  +  SJ*)  (5.41) 


where  u(e)  and  <£(u,e)  are  defined  in  (5.39)  and  (5.28),  respectively.  (5.41) 
shows  that  the  0(e)  coefficient  of  the  expansion  in  (5.39)  is  dependent  only 
on  u°.  Thus,  J2,  and  are  determined  uniquely  by  u°.  For  the  special 
case  assumed  in  the  above  deviation  we  showed  this  explicitly  by  observing 
that  in  (5.39)  the  only  nonzero  part  in  the  coefficient  of  e  was  given  by 
(5.41). 

When  the  reduced  policy  u^  is  not  unique,  then  (5.41)  may  not  hold 
and  we  can  only  guarantee 


J (u  )  -  J*  -  0(e) , 
n 


(5.42) 


although  in  practice  we  could  expect  much  greater  accuracy.  The  importance 
of  the  reduced  problem  will  now  be  used  to  develop  a  decentralized  optimiza¬ 
tion  algorithm  for  the  infinite  horizon  discounted  cost  problem. 
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5.4.  Decentralized  Optimization 


In  this  section  a  decentralized  algorithm  is  presented  for  obtaining 
a  near-optimal  control  policy  to  the  singularly  perturbed  controlled  Markov 


chain  problem  of  Section  5.1.  Since  the  algorithm  is  based  on  value 
iterations,  it  is  more  convenient  to  work  with  the  discrete-time  version  of 
(5.1),  namely 


p(k+l)  -  p(k)(^l  +  B(u)  +  I) 


(5.42) 


with  discounted  cost 


J(x  )  -  min  E  E  p  f(jt  ,u(:c)) 
0  uSU  x  k*0  *  * 


(5.43) 


where  0<  p  <  1.  The  optimality  condition  becomes 


J*  -  min  p{^2.  +  B(u)  +  x)j*  +  f(u)}. 
u£U  £ 


(5.44) 


J*  is  the  n-dimensional  optimal  cost  vector  where  J*  is  the  cost  if  the 
process  starts  in  state  i,  for  i«l,...,n. 

For  any  given  policy  cost  J  can  be  found  by  solving  the  set  of  n 
linear  equations 


J  *  p[(~+  B  +  I)J  +  f] 


(5.45) 


which  we  can  rewrite  as 


I+VBT 


WBT  W(4  +  B+  I)S  J 


(5.46) 


j  r 


where  J  *VJ,  J  ■  MJ  as  in  (5.6)  .  Substitution  of  J  =  J°-0(c)  and  J  =J°  +  0(-:) 
n  y  nr  Y  Y 


into  (5.46)  gives  ■  0  and 

J°  -  p  (I  +  VBT) J°  +  pVf  (5.47) 

n  n 

and  hence,  - 

J  -  TJ°  +  0(e)  (5.48) 

n 

which  agrees  with  (5.27).  The  form  of  the  aggregate  cost  equation  (5.47) 
suggests  that  J°  can  be  considered  as  the  optimal  cost  J*  for  an  aggregate 
problem  whose  optimality  condition  is 

J*  -  min  p  [ (I  +  V (u)B(u)T)  J*  +  V(u) f  (u)  ] .  (5.49) 

n  MB] 

An  optimization  algorithm  known  as  the  Jacobi  iterations  [15,44],  when  applied 
to  (5.44)  and  (5.49)  is,  respectively 


Jk+1  -  min  p{(A£hL  +  B(u)  +  I)Jk+ f(u)}  (5.50) 

e 
u 

k+1  W 

J  -  min  p{(I  +  V(u)B(u)T)J  +V(u)f(u)}.  (5.51) 

n  u  n 

In  the  aggregate  problem  (5.51)  the  dimension  is  reduced  from  n  to  N,  but  a 
difficulty  is  that  it  is  not  obvious  how  the  control  obtained  in  (5.51)  depends 
on  the  original  states.  To  avoid  this  difficulty,  we  rewrite  (5.51)  in  the 
form 

k+1  k 

J  -  min  p{V(u)  [(I  +  B(u))TJ  +  f(u)]},  (5.52) 

n  u  h 

and  interpret  the  term  in  the  brackets  as  the  cost  g  (u)  of  an  average  cost- 
per-stage  problem.  It  is  an  n-column  vector  which  can  be  partitioned  into  N 
si. 
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described  by  decoupled  fast  chains  A^ (u^ )  where  denotes  controls  for^ states 

in  j-th  group.  The  solutions  for  the  average-cost-per-stage  problem  for  each 

fast  chain  exist  under  the  ergodicity  assumption  on  A  (u^).  They  can  be 

J 

found  using  algorithms  such  as  [54,55].  Then  (5.52)  is  rewritten  in  a 
decentralized  form 


Jk+1  ■  min  p[v^(u^)gk(u^)] 


•  i3) 


'j 


u 


for  each  group  j*l,...,N.  Therefore,  if  at  step  k  a  "coordinator”  obt<- 

the  results  of  (5.53)  calculated  locally  in  each  group,  its  role  is  to  co..- 

k+1 

solidate  the  result  in  the  form  of  J  .  This  information-  is  then  used  to 

n 

k+1 

form  the  new  fast  cost  g  (u)  according  to 


gk+1(u)  -  [I  +  B(u)]TJk+1+f(u). 

n 


(5.54) 


Graphically,  this  algorithm  has  the  decentralized  structure  in  Figure  5.1. 

The  aggregate  Markov  process  assumes  that  the  "fast  chains"  have 

reached  their  steady  state  probabilities  V ^  .  Each  aggregate  iteration  (5.53) 

is  in  fact  an  infinite  horizon  problem  for  the  "fast  chains."  These  infinite 

horizon  minimizations  are  in  the  form  of  N  separate  average  cost  per  stage 

k  i 

problems  with  respect  to  the  costs  g  (u  ) ,  j*l,...,N.  These  costs  contain 
not  only  the  instantaneous  subsystem  cost  f(u^),  but  also  the  cost  contribu¬ 
tions  due  to  coupling  to  other  subsystems.  It  is  the  latter  that  enables  the 
fast  problems  to  be  solved  independently.  Other  iterative  algorithms,  such  as 
Gauss-Seidel  [15]  can  be  decentralized  in  a  similar  fashion. 

We  can  now  show  that  the  limiting  (k-*-®)  policy  u„  in  (5.5  3) 
satisfies  (5.31)  with  thus  establisheing  the  near  optimality  of  u  . 


i 


Lemma  5.1:  The  policy  obtained  in  (5.53)  as  k-*”a  satisfies  the  reduced 


optimality  equation  (5.31).  Thus,  if  the  policy  is  unique,  then 

J(u  )-J*  *  O(e^) . 

n 

Proof;  From  [55],  we  can  express  the  limit  in  (5.53)  as  N  average  cost  per 
stage  optimality  equations  of  the  form 


J  1  =*  min  p (A . (u^)c  +  g(u^)) 
^4  j  J  3 

J  uJ 


min  p  ( (u^  )  c..  +  B_.  ^  (u^  )  T^ 

UJ  1 


(5.55) 

+  B  (uj)T  J  +T.J  +f(uj))  (5.56) 

JN  N  11  jj  J  j 


or  in  vector  form 


TJ  =  min  p (A(u)c + B(u)TJ  +  TJ  +f(u))  (5.57) 

n  u  n  n 

where  c6Rn  are  dual  variables  as  defined  in  [14,55].  When  u*^,  (5.57) 
becomes 


TJ  =■  p  (A(u  )c  +  B(u  )TJ  +TJ  +  f(u  )).  (5.58) 

n  n  n  n  n  n 

Premultiplying  by  W  we  obtain 

0  *  WA(u  ) c  +  WB (u  )TJ  +  Wf(u  )  (5.59) 

n  n  n  n 

which,  from  (5.35)  uniquely  defines  c  as  SJ^.  Thus,  letting  p*-j-—,  a>0* 
(5.57)  takes  the  form 

aTJ°  =  rain{A(u)SJ1  + B(u)TJ°+ f(u)}  (5.60) 

h  u  Y  n 

which  is  equivalent  to  (5.31),  thus  from  Section  5.3,  the  O(t^)  optimality  of 
the  cost  using  policy  u^  is  established.  In  concluding  this  chapter,  several 
useful  properties  of  this  algorithm  should  be  cited. 

1.  No  system  of  linear  equations  of  any  order  need  ever  be  solved. 
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r.. 


B 


f 


2.  Storing  "aggregate"  iterations  of  the  form  3T  rather  than  just 
3  reduces  the  number  of  computations  and  memory  requirements . 

3.  The  algorithm  is  independent  of  the  small  parameter  e.  Thus, 
in  general  the  algorithmic  stiffness  of  the  high  order  problem 
has  been  removed. 

4.  Fast  chains  perform  their  local  minimizations  in  parallel.  Fast 

costs  are  updated  on  the  slow  time  scale  by  receiving  only 
k+1  N 

J  £R‘  from  the  aggregate  coordinator. 


I! 

K 


t 


103 


CHAPTER  6 

THE  AVERAGE  COST  PER  STAGE  PROBLEM  FOR  CONTROLLED  MARKOV  CHAINS 

6.1.  Introduction 

In  many  controlled  Markov  chain  problems  the  discounted  cost  is 
not  a  physically  meaningful  measure  of  system  performance.  In  such  cases  the 
average  cost  per  stage  [55,44,15]  is  commonly  used.  In  discrete  time,  this 
cost  function  takes  the  form 

1  T 

J  *  min  lim  rr^r  E  Z  f(x,  ,u(x,  ))  (6.1) 

uEU  T-*»  1+T  x  k 

where  U  and  f(i,u(i))  are  defined  as  in  the  discounted  case. 

In  this  chapter  we  again  use  the  two-time-scale  property  of  (5.42) 
to  develop  a  computationally  efficient  algorithm  for  obtaining  a  near  optimal 
control  policy  u  for  the  problem  defined  by  (5.42)  and  (6.1). 

In  Section  6.2  the  optimality  conditions  for  the  average  cost  per 
stage  problem  are  preserved  and  the  cost  for  a  fixed  policy  decomposed  into 
"fast"  and  "aggregate"  components. 

In  Section  6.3,  we  consider  the  solution  to  the  problem  (6.1)  as 
c-0  in  (5.42).  This  results  in  a  reduced  problem  for  which  a  near-optimal 
policy  is  obtained. 

In  Section  6.4  a  decentralized  algorithm  is  developed  for  obtaining 
a  near  optimal  policy  in  a  computationally  attractive  manner. 

Finally,  in  Section  6.5,  an  example  is  given  illustrating  the  decen¬ 


tralized  algorithm. 


c 


Decomposition  of  the  Cost  for  a  Fixed  Polic 


In  this  section  we  assume  that  the  control  policy  u  is  fixed  and  we 
wish  to  obtain  the  corresponding  cost  (6.1).  By  using  the  two-time-scale 
property  of  (5.42)  we  will  attempt  to  decompose  the  cost  J  into  "fast"  and 
"aggregate"  components.  This  will  lead  us  ot  the  concept  of  the  dual 
variables  c€  Rnt  and  their  decomposition  into  "fast"  and  "aggregate" 
components.  These  results  will  be  of  importance  in  the  next  two  sections. 

Under  the  assumptions  of  Chapter  4  and  Section  5.1,  (6.1)  can  be 
written  in  the  convenient  form 


J  ■  min  p(u)f(u) 
dEU 


(6.2) 


where  f(u)  is  defined  in  (5.29)  and  p(u)  is  the  n-dimensional  row  vector  of 
steady  state  probabilities  defined  by 

p(u)  -  p(u)  +  B(u)  +  I) 


0  -  p(u)<^-  +  B(u) ) 


(6.3) 


E  p, (u)  ■  1. 
i-1  i 


(6.4) 


Note  that  unlike  in  Chapter  5,  here  J  is  independent  of  initial  state  and  is 
thus  a  scalar.  Hence  the  optimal  cost  per  stage  is  the  same  for  all  states. 

Assume  that  there  are  only  a  small  number  of  policies  from  which 
to  choose.  In  this  case,  it  may  be  simpler  to  evaluate  (6.2)  for  each  policv 
and  pick  the  minimum.  Unfortunately,  (6.3)  usually  represents  a  large  set 
of  ill-conditioned  linear  equations.  However,  as  we  have  shown  in  Chanter  4, 
5(u)  can  be  written  in  the  form 


P(u)  =  n(u)V(u)  +  y ( u ) W . 


(  n .  3  > 


LOS 


Then,  for  a  given  policy,  (6.2)  takes  the  fora 


J  -  nVf  +  yWf 


nf  +  yt 
n  y 


(6.6) 


where  n  and  y  can  be  computed  efficiently  using  the  asymptotic  series  solution 
developed  in  Chapter  4. 

As  an  alternative  to  this  approach,  consider  the  optimality  equation 
for  the  average  cost  per  stage  problem 


J*1  ■  min  {  — +  B (u) ) c*  +  f  (u)  } 


(6.7) 


where  J*  is  the  optimal  average  cost  per  stage  and  c*€  Rn  the  optimal  dual 
variables  as  defined  in  [55].  Given  a  policy  we  could  solve  [14] 


Ji  -  +  B(u))c  +  f (u) 


(6.8) 


for  J  and  c  and  pick  the  policy  with  minimum  J  as  optimal  if  the  number  of 


policies  is  small.  Again,  to  avoid  solving  this  usually  large  set  of  possibly 


ill-conditioned  linear  equations  (6.8)  we  premultiply  (6 


V 

.8)  by  . 

L  w  J 


Define 


(6.9) 


and  obtain 


eWBT  W(A+eB)S  I  I  C 


(6.10) 


Expanding  J,  and  in  asymptotic  power  series  of  the  form 


- .  •  ' . 


J  -  J  +  eJ.  + 
o  1 

C  -  C°  +  eC1  + 

n  n  n 

C  -  C°  +  eC1  + 

Y  Y  Y 


(6.11) 


(6.12) 


(6.13) 


we  obtain  C  «0,  and  the  remaining  terms  defined  by 


J  1  -  VBTC  +  Vf 
o  —  n 


(6.14) 


C1  -  -(WAS)-1(WBTC°  +  Wf) 
Y  n 


(6.15) 


'J,  1  -  VBTC1  +  VBSC1 
1  —  n  Y 

C2  -  -(WAS)~1(WBTC1  +  WBSC1) 
Y  n  Y 


(6.16) 


(6.17) 


J,  1  -  VBTCk  +  VBSCk 
k  —  n  Y 

Ck+1  -  -(WAS)_1(WBTCk  +  l^BSCk)  . 
Y  r|  y 


(6.18) 


(6.19) 


(6.14)-(6.19)  can  be  solved  on  a  reduced  order  basis.  Note  that  (6.14),  (6.16) 
(6.18)  have  more  unknowns  than  linear  independent  equations.  One  way  around 
this  problem  as  recommended  in  [14]  is  to  set  one  of  the  C  's  equal  to  zero 

k  *• 

for  all  k.  Once  this  is  done  the  remaining  C  's  and  J,  can  be  calculated 

ni  k 

uniquely. 

k  k+1 

Once  Cn  is  found,  Cy  is  uniquely  determined.  This  property  of  non- 

uniqueness  of  is  to  be  expected  since  if  C*  is  the  optimal  dual  variable  in 

(6.7),  so  is  C*+5JL,  Vo€R.  Thus,  if  we  follow  the  above  procedure  in 
ic  Ic 

computing  J  ,  C  ,  and  C,  ,  we  will  obtain  the  unique  average  cost  per  stage 

k  n  Y 

given  by  (6.11)  and  the  dual  variables  C  given  by 


:c  +  SC 


(6.20) 


,  s 

.V 

cd 
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will  be  unique  Co  within  an  additive  vector  of  the  form  5  1^  <5  6R.  Using 
either  (6.6)  or  (6.10)  we  can  efficiently  compute  the  avarage  cost  per  stage 
for  a  fixed  policy. 

Before  concluding  this  section,  it  is  interesting  to  note  that 
(6.14)  is  of  the  form  of  an  "aggregate"  average  cost  per  stage  equation  with 
transition  matrix  I  +  VBT  and  instantaneous  cost  Vf.  This  idea  will  be  of 
significance  in  Section  6.4. 


n 

I 


6.3.  Near  Optimal  Control 

The  problem  defined  by  (6.1)  and  (5.42)  can,  in  theory,  be  solved  by 
a  number  of  different  methods  (14,54,55].  However,  analogous  to  Section  5.3 
for  the  discounted  problem,  we  wish  to  take  advantage  of  the  singularly 
perturbed  model  of  (5.42)  to  obtain  a  reduced  or  unperturbed  (e“0)  problem. 

In  this  section  we  define  the  "reduced"  average  cost  per  stage  problem  and 
establish  in  what  sense  the  resulting  policy  is  near  optimal.  The  results 
of  this  section  are  obtained  in  much  the  same  way  as  the  near  optimality 
results  of  Section  5.3  for  the  discounted  cost  problem.  Therefore,  to  avoid 
redundancy,  many  references  will  be  made  to  that  section. 

Again,  for  intuitive  understanding  consider  first  the  special  case 
where  u  is  a  scalar  and  takes  on  values  in  a  segment  Uq.  The  optimality 
equation  for  this  problem  can  be  written  in  the  form 


0  ■  min  { +  B(u))C  +  f  (u)-J  1} 

uEU  S 
o 


min  (y?  (u,s) } 
u£U0 


(6.21) 


where  f(u)  is  defined  in  (5.29).  For  any  policy,  C  and  J  have  the  form  (6.11) 
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and  (6.20).  Substituting  these  forms  Into  (6.21)  we  obtain 

0  -  oin{A(u)SC1  +  B(u)TC°  +  f(u)-J°l 
uEU,  Y  n  " 

+  e (A(u) SC2  +  B(u) (TcJ  +  JC*) -J1!) +  0 (e2) }  . 


(6.22) 


Thus,  the  optimality  condition  for  the  reduced  problem  becomes 


0-  min  (A(u)SC1  +  B(u)TC°  +  f (u)-J°l> . 

uEU  Y  n 

o 


(6.23) 


Again  denote  the  control  minimizing  (6.23)  as  u  .  Assume  that  the  derivatives 

A  ,  B  ,  and  f  are  continuous  and  that  the  unique  minima  for  each  row  in 
u  u  u 

(6.23)  are  reached  in  the  interior  of  U  ,  thus 

o 


A  (u  )SC1  +  B  (u  )TC°  +  f  (u  )  -  0 
uriy  unn  u  n 


(6.24) 


and  with  u  given,  we  obtain 


J°1  =■  A(u  )SC*  +  B(u  )TC°  +  f(u  ) 

—  n  y  n  n  n 


(6.25) 


Premultiplying  (6.25)  by  V(u  )  and  W  results  in 


J°1  *  V(u  ) B(u  )TC°  +  V(u  )f(u  ) 

—  non  n  n 

0  »  WA(u  )SC1  +  WB(u  )TC°  +  Wf(u  ) 

n  y  n  n  n 


(6.26) 


(6.27) 


which  are  in  the  form  of  (6.14)  and  (6.15)  and  hence  can  be  solved  uniquely 


for  J°,  SCT,  and  TC°.  To  see  in  what  sense  u  is  near  optimal,  let  the 

Y  n  n 

optimal  control  of  (6.21)  u*,  have  the  series  form  (5.36).  Then  substitute 
(5.37)  and  analogous  expressions  for  B(u*)  and  f(u*)  into  (6.22)  and  obtain 


r: 


* 


a 

■ 


B 

I 
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0  -  A(u°)SC1  +  B(u0)IC°+f(u0)  -  J°l+e([A  (u°)SC1+  B  (u°)TC°+f  (upju1 
y  n  —  u  Yu  nu 

+  A(u°)SC2  +  B(u°)(TC1+SC1)  -  J1!) +0(e2)  -  p(e).  (6.28) 

y  n  y  ~ 

o  O*  o* 

From  (6.25)  and  (6.23)  at  e«0,  u  -u  and  thus  J  ,  C  ,  and  C  are  deter- 

n  n  Y 

mined  by  u  .  Next,  note  that  due  matching,  the  e-term  in  (6.28)  is  zero 

n 

Ve  >  0  since  u*  is  optimal  Ve>0.  This  term  involves  the  unknown  first  order 

optimal  control  term  u^.  However,  the  expression  multiplying  u1  is,  by  (6.24), 

1*  2*  1* 

identically  zero.  Thus,  J  ,  SC^  ,  and  TC^  are  obtained  from 

JX1  -  A(u°)SC2  +  B(u°)(TcJ+SC*)  (6.29) 

and  hence  are  uniquely  defined  by  u° *  u^ .  Therefore,  when  u°  is  found,  the 

series  for  the  optimal  cost  (6.11)  and  optimal  dual  variables  (6.12)  and 

2 

(6.13)  are  matched  up  to  0(e  ).  Thus  u^  is  near  optimal  in  the  sense  that 

J(u  )  -  J*  -  0(e2)  (6.30) 

n 

C(u  )-  C*  -  0(e2), 
n 

For  the  more  general  problem  posed  in  Section  6.1,  it  can  be  shown  [60]  that 
when  u^  is  unique  (6  30)  and  (6.31)  hold  without  the  differentiability  require¬ 
ments  on  A(u) ,  B(u) ,  and  f(u).  This  discussion  is  given  in  Section  5.3  and 
will  not  be  repeated  here. 


6.4.  Decentralized  Optimization 

In  Section  6.2,  the  cost  for  a  fixed  policy  was  decomposed  into 
"aggregate"  and  "fast"  components.  This  enabled  reduced  order  computations 


for  finding  the  cost.  Thus,  for  a  small  number  of  policies,  the  cost  for  each 
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policy  computed  and  the  optimal  chosen.  When  the  number  of  policies  is  large 
or  infinite,  an  exhaustive  search  for  the  optimal  policy  is  impractical  and 
one  of  many  iterative  policy  improvement  algorithms  must  be  used  [14,54,55]. 

In  this  section  we  present  a  decentralized  algorithm  for  computing  a  near 
optimal  policy  to  the  average  cost  per  stage  problem  posed  in  Section  6.1. 

The  structure  of  the  algorithm  is  similar  to  that  of  Section  5.4  for  discounted 
problem  and  hence  offers  the  same  computational  advantages. 

For  every  policy  u€  lT,  p(u)  can  be  written  in  terms  of  the 
"aggregate"  and  "fast"  states  as  in  (6.5).  Substituting  (6.5)  into  (6.2)  we 
obtain 

J  -  min(n(u)V(u)f(u) +y(u)Wf(u)}.  (6.32) 

u6U 

However,  y(u)  is  uniquely  defined  by 

y(u)  *  -en(u)V(u)B(u)S(u) (W(A(u)  +  eB))S(u))  1 

-  -eF(u).  (6.33) 

Thus,  (6.32)  takes  the  form 


where 


J  *  min{n (u) f (u) } 
u£U 

f(u)  *  (V(u)-cF(u)W) f (u)  . 


(6.34) 


Thus,  the  control  problem  has  been  reformulated  in  terms  of  the  aggregate 
states  "SR  .  For  a  given  policy  we  may  compute  J  from  (6.34)  or  by  solving 
the  following  set  of  linear  equations  for  J  and  the  aggregate  dual  variables 
C  =  R" . 


(6.35) 


J  1  -  (V(u)-sF(u)W)B(u)TC  +  f(u) 

~  h 

where  (V(u)-eF(u)W)B(u)T  is  the  perturbed  aggregate  generator.  Then  the 
optimality  equation  for  the  aggregate  problem  (6.34)  becomes 

J  1  *  min  { (V(u)-eF(u)W)B(u)TC  +  t(u)}.  (6.36) 

"  u€U*  n 

It 

Assume  now  that  for  an  arbitrary  policy  u  we  have  obtained  the  corresponding 

]£ 

aggregate  dual  variables  C^.  In  most  iterative  algorithms  [14,59,55],  to 
improve  the  policy  u  ,  the  following  minimization  is  required 

h(Ck)  -  min{(V(u)-eF(u)W)B(u)TCk+f(u)}.  (6.37) 

n  UEU  n 

In  the  limit  as  k-*-®,  h(C^) -*■  J  1^.  The  elements  of  the  aggregate  generator  and 
instantaneous  cost  are  complex  nonlinear  functions  of  the  controls  for  many 
of  the  original  states  and  the  minimization  in  (6.37)  is  difficult  to  carry 
out  if  not  impossible.  Thus,  we  have  saved  little  by  solving  the  low  order 
aggregate  problem  (6.36)  as  opposed  to  (6.7)  unless  a  further  simplification 
is  made.  By  letting  e-^O  in  (6.37)  we  obtain 

h(Ck)  -  min{V(u)B(u)TCk  + V(u)f(u)}.  (6.38) 

n  ueu  n 

As  expected,  this  optimization  problem  is  equivalent  to  one  we 
obtain  if  we  were  interested  in  minimizing  only  the  Oth  order  aggregate  cost 
term  in  (6.14).  In  other  words,  in  (6.38)  we  are  seeking  a  near  optimal  policy 
chat  satisfies  the  reduced  optimality  equation 


J 


mi.i{V(u)B(u)TC 


n 


+  V(u) f (u) } . 


(6.39) 


The  analogy  with  the  discounted  problem  of  Section  5.4  should  now  be  clear. 
We  proceed  to  interpret  the  minimization  operation  in  (6.38).  The  need  to 


L12 


know  V(u)  as  a  function  of  u  and  the  Inability  to  disaggregate  the  policies 
obtained  in  (6.38)  back  to  the  original  states  makes  the  minimization  of 
(6.38)  Impractical  in  its  present  form.  This  difficulty  can  be  avoided  by 
rewriting  (6.38)  in  the  form 


h(Ck)  -  min{V(u)[B(u)TCk+f(u)]}. 
n  u6U  n 


(6.40) 


As  in  the  discounted  problem,  partition  u  and  f(u)  according  to  the 
fast  subsystems 


r  ii 

u 

f(uL) 

u  ■ 

2 

u 

f(u)  - 

f(u^) 

• 

• 

• 

• 

• 

N 

u 

N\ 

_f(u  )_ 

(6.41) 


To  more  clearly  see  the  form  of  the  vector  B(u)TC  +  f (u)  we  note  that  for  a 

n 

three  class  system  it  takes  the  form 


8u(u1)tic!J1  +  +  C13(,‘1>T3<3  +  £("X) 

B21(“2>T1CH1  +  B22(“2)T2<2  +  B22("2>T3<3  +  f("2> 
.B31(“3)TlCn1  +  B32(“3)T2Cn2  +  B33('i3)T3<3  +  f<“3>. 


(6.42) 


It  is  crucial  that  each  row  i  of  (6.42)  is  a  function  of  only  one  control 
u(i),  i*l,...,n.  We  can  interpret  each  row  of  (6.42)  as  a  cost  g  (u(i)) 


depending  on  a  single  control  for  a  given  C  .  If  we  partition  this  cost 


according  to  the  fast  subchain  dimensions  we  obtain 
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gk(u) 


k,  1. 
g  (u  ) 


k.  2, 
g  (u  ) 


k ,  N. 

g  (u  ) 


(6.43) 


This,  (6.40)  takes  the  form 


h(C^)  ■  min{V(u)gk(u)> , 
n  u6U 


(6.44) 


However,  since  V(u)  Is  block  diagonal,  (6.44)  can  be  written  as 


h ,  (Ck)  -  min  (V  (u1)g5t(u1)} 
in  i  i  A 

uSU 


(6.45) 


h  (Ck)  -  min  (V  (uN)gk(uN)}. 

N  n  N  rTN  " 
u  €U 


(6,46) 


This  formulation  points  out  the  possibility  of  decentralization,  because 


(6.45)-(6.46)  are  average  cost  per  stage  problems  for  the  fast  chains  A. (u  ) 


k  i 

with  respect  to  the  instantaneous  cost  g  (u  ),  i*l,...,N  which  can  improve 


k,  i, 


the  controls  at  a  "local"  subsystem  level  using  "local"  costs  g  (u  ) .  The 


lc  i 

g  (u  )'s  are  updated  on  a  slower  time  scale  at  the  aggregate  level  which 


k+1 

assigns  the  new  C 


k  N 

Once  the  vector  h(C  ) €  R  is  computed,  different  algorithms  [14,54,55 

n 


k+1 


use  different  means  to  update  the  dual  variables  C  .  A  recently  developed 


algorithm  for  this  problem  is  due  to  Varaiya  [55].  We  now  use  this  algorithm 
to  illustrate  the  hierarchical  structure  of  the  decentralized  optimizations 


(6.45)-(6.46) . 


The  first  thing  we  must  establish  is  that  the  problem  we  are 
attempting  to  solve  is  a  well  defined  average  cost  per  stage  problem 
satisfying  all  the  conditions  necessary  for  Varaiya's  algorithm  to  converge  to 
the  optimal  policy.  The  problem  we  are  trying  to  solve  is  summarized  in  (6.39). 
First  note  that  I + V(u)B(u)T  is  a  well  defined  transition  matrix  whose  elements 
depend  continuously  on  u£U.  Second,  under  the  ergodicity  assumptions  on 
(5.42),  I  +  V(u)B(u)T  contains  a  single  ergodic  class  Vu€  U.  And  finally, 
V(u)f(u)  is  a  well  defined  vector  of  instantaneous  costs  who§e  elements  depend 
continuously  on  the  vector  u.  Thus,  the  problem  satisfies  the  conditions  in 
[55]  necessary  for  the  algorithm  to  converge  to  the  optimal  policy.  We  now 
review  Varaiya's  algorithm  for  the  high  order  problem  (6.7). 

Given  an  arbitrary  C°,  define  h(C°)  as  the  vector  of  values  resulting 
from  the  n  pointwise  minimizations 

h(C°)  -  min{  (^1  +  B(u))C°+  f(u)}.  (6.47) 

ueu  e 

The  algorithm  then  proceeds  as  follows:  Let 

h(C°)  -  max  h. (C°)  (6.48) 

i  1 

h(C°)  -  min  h.(C°)  (6.49) 

i  1 

and  find  *  C° + Atf (C°)  according  to  a  discretization  of  the  differential 
equation  [56] 

~  -  h(c)  --  [h(c)Tl]l  =  f(c).  (.6.50) 

at  n  —  — 

Then  is  used  to  obtain  h(C^)  as  in  (6.47)  and  the  cycle  continues. 

Convergence  is  monotonic  and  can  be  "measured"  since  h (CK)-h (C^)  -  0  ad 
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We  now  apply  this  algorithm  to  solve  (6.79).  For  a  given  initial 


C°,  define  h(C°)€R^  as 
n  n 


where 


h(C°)  -  min{V(u)g°(u)} 
71  ueu 


g°(u)  *  B(u)TC°  +  f(u). 


(6.51) 


(6.52) 


In  component  form  (6.51)  becomes 


h..(C°)  *  min  {V1  (u1)g°(u1)} 

n  11 
u  eu 


(6.53) 


h  (C°)  ■  min  {V  (uN)g°(uN)}. 
N  n  «  «  a 


uV 


(6.54) 


Each  "fast"  chain  average  cost  per  stage  minimization  in  (6 .53)— (6 .54)  can 
be  solved  in  parallel  using,  for  example,  Varaiya's  algorithm.  The  results 
of  the  minimizations  gives 


h  (C°)  _1  =  rain  (A.  (u1)C1  +  g°(u'L)  } 
in  i  i  A 

u  €U 


(6.55) 


hN(C°)  1  -  rain  {AN(uN)CN  +  g°(uN)) 
uN€UN 


n. 

where  C  €  (R  ,  i  =  l,...,N  are  the  dual  variables  during  the  fast  optimizations 
Note  that  h.(C°)  is  the  optimal  average  cost  per  stage  for  fast  chain  i  under 


instantaneous  cost  g  (u  ),  for  i  =  l,...,N. 


h(C°)  =  max  h(C°) 
n  i 


(6.56) 


h(C°)  =  rain  h(C°) 
~  n  i  n 


(6.57) 
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and  find  ■  C°  +  At  f(C°)  according  to  a  discretization  of  the  differential 
n  n  n  n 

equation 

dc  1 

-rr  -  h(C  ) [h(C  )  1]1  *  f(C  ).  (6.58) 

dt  n  n  n -  n 

Use  to  update  the  fast  cost 
n 

gX(u)  -  B(u)TcJ  +  f(u).  (6.59) 

Intuitively,  the  C^’s  "distribute"  information  about  the  "motion"  of  the  total 
system  relative  to  each  fast  subsystem.  The  aggregate  interactions,  B(u)T  in 
(6.59),  are  "weighed"  accordingly  and  the  updated  cost  g^'(ui),  i=*l, —  ,N  is 
obtained.  It  is  this  component  of  the  cost  g^(u)  in  (6.59)  that  allows  the 
optimizations  of  (6.55)  to  be  carried  out  independently.  Next,  (6.51)  is 
solved  for  h(C^)  given  g^(u)  and  the  cycle  continues.  Convergence  occurs  when 

h(Ck)  -  h(Ck)  <  5 
n  —  n 

where  o  is  some  design  tolerance.  Thus,  the  aggregate  serves  as  a 
"coordinator"  passing  and  receiving  the  necessary  information  between  sub¬ 
systems  allowing  the  subsystems  to  compute  "local"  controls  that  are 
"globally"  near  optimal.  Graphical^/,  the  aggregate  coordination  scheme  is 
given  in  Figure  6.1. 

We  now  show  that  the  limiting  policy  (k-  »)  in  (6.55)  satisfies 
optimality  equation  (6.23).  Thus,  the  near  optimality  results  of  Section  6.3 
hold . 

Lemma  6.1.  Let  u^  be  the  near  optimal  policy  obtained  using  the  two-time 
scale  algorithm.  Then  if  u  is  unique,  J(u  )  satisfies  (.6.30). 


h,  (Ck) * min{V. (u1)gk(u1)} 
in  ^l 

u 


Proof:  When  the  algorithm  converges,  (6.55)  becomes 


J(u  )  min  (A.  (u^)C^  +  g(u^)} 

n  1  1  1 

.  u  €U 

• 

J(u  )  1  ■  min  {A„(uN)CN+ g(uN)} 

n  Uy 

or  in  vector  form 


where 


J(u  )  1  »  min{A(u)C_+ g(u) } 
n  uEII  F 


(6.60) 


(6.61) 


(6.62) 


is  the  vector  of  dual  variables  resulting  from  the  fast  subproblems .  We  now 

show  how  C  relates  to  the  expansion  terms  of  Section  6.1. 
r 

From  (6.59),  (6.61)  becomes 

J(u  )  1  -  min{ A(u)Cr.  +  B(u)TC  +f(u)>  (6.63) 

n  -  t  n 

u£U 

and  at  u  *  u  , 

n 

J(u  )  1  =  A(u  )C_  +  B(u  )TC  +  f(u  ). 

n  —  n  F  n  n  n 

Prenultiplving  by  V(u^)  and  W  we  obtain 


(6.64) 


B 


119 


which,  from  Section  6.2,  uniquely  defines  C_  as  SC*  and  C  as  C°.  Thus, 

»  ’  J  F  y  n  n 


(6.63)  becomes 


J(u  )  1  -  min{ A(u) SC1  +  B (u)TC°  +  f (u) } 
n  "  u6U  Y  n 


(6.67) 


which  is  equivalent  to  (6.23).  Thus,  if  u  is  unique,  then  J(u  )-J*»0(c  ). 

n  n 

The  attractive  computational  aspects  of  the  algorithm  are  the  same 
as  the  algorithm  for  the  discounted  problem  and  are  listed  at  the  end  of 
Section  5.4.  We  close  this  chapter  and  this  thesis  in  the  next  section  with 
an  example. 


6.5.  Example  -  Minimizing  the  Average  Cost  Per  Stage 

In  this  section  we  given  an  example  of  applying  the  algorithm  pre¬ 
sented  in  Section  6.4.  The  controlled  Markov  chain  we  will  be  considering  has 
the  following  state  transition  matrix: 


(6.63) 


where  the  states  are  defined  by 


X2 

X3 

X4 

X5 

X6 

x7 

X8 

X9 

G 

2 

2 

2 

1 

1 

1 

0 

0 

0 

D 

0 

1 

2 

0 

1 

2 

0 

1 

2 

(6.69) 


The  control  problem  can  be  visualized  as  one  of  maintenance  scheduling.  From 

(6.69)  the  state  of  the  process  is  defined  by  the  variables  G  and  D.  G 
corresponds  to  the  number  of  power  generating  units  available  while  D 
corresponds  to  the  demand  in  terms  of  generating  units  needed.  Markov  models 
of  this  type  are  common  in  optimal  resource  scheduling  problems  [22,59]. 

The  units  have  a  common  failure  rate  over  which  we  have  no  control. 

On  the  other  hand,  we  can  control  the  rate  of  repair.  By  increasing  the  repair 
rate,  the  probability  of  repairing  the  generator  during  the  next  unit  time 
interval,  y,  improves. 

The  higher  probability  of  repiar  is  obtained  only  by  increasing  the 
system  cost  due  to  assigning  labor,  new  parts,  etc.  Therefore,  the  control 
variable  will  be  the  scalar  quantity  u  taking  values  in  the  closed  interval 
.02<y<  .2.  When  the  process  arrives  in  one  of  the  nine  states  identified  in 

(6.69) ,  the  amount  of  maintenance  scheduled  is  proportional  to  y.  Hence,  the 
control  problem  is  to  find  the  policy  u(G,D)  that  minimizes  the  average  cost 
per  stage  (6.1)  with  instantaneous  cost 

f (G,D,y  (G,D) )  -  [  (D-G)  +  ]“  +  K(u(G,D))“  16.70) 

4-  -r  1 

where  Co)  *max(0,b).  This  cost  is  composed  of  two  terms :  [(D-G)  ]  penalizes 

2 

for  not  meeting  the  demand  while  n(u(G,D))“  penalizes  maintenance  costs.  Thus, 
the  problem  is  well  defined  and  possesses  a  nontrivial  solution. 


Note  that  (6.68)  is  of  the  form  ^+B+I  where 


r-- 
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where  e*  .2  in  (6.68).  First  the  problem  was  solved  using  K*30  in  (6.70). 
Using  the  algorithm  of  Section  6.4,  the  following  near  optimal  policy  was 


obtained 


For  the  purpose  of  comparison,  the  optimal  policy  u*  was  computed  for  several 
values  of  e  including  e*  .2.  These  results  are  given  in  Table  6.1. 


Table  6 
Table  6 


Table  6.1.  Optimal  policies  and  cost  for  K»  30 


e 

.5 

.2 

.15 

.1 

.05 

.01 

.02 

.02 

.02 

.02 

.02 

.02 

.02 

.02 

.02 

.02 

.02 

.02 

.02 

.02 

.02 

.02 

.02 

.02 

.092 

.104 

.106 

.108 

.110 

.111 

U*(G,D) 

.107 

.111 

.111 

.111 

.111 

.111 

.178 

.111 

.107 

.102 

.097 

.092 

.088 

.126 

.134 

.143 

.152 

.16 

.138 

.144 

.144 

.144 

.144 

.143 

.2 

.149 

.136 

.121 

.104 

.090 

J* 

e 

.65776 

.671 

.67145 

.67095 

.66913 

.66648 

_ 

J(un) 

.69722 

.6811 

.67767 

.67399 

.66999 

.66654 

1  clearly  illustrates  the  convergence  of  u*  to  u^  as  s-0.  Also  in 

2 

1,  note  that  J(u  )-J*  <  e  for  all  values  of  z. 
n  =- 


To  see  the  effects 


of  changing  the  cost  coefficients  in  (6.70),  let  K»20..  The  near  optimal  policy 


u  is  then 

n 


u 

n 


'.02  * 
.02 
.02 
.140 
.140 
.108 
.200 
.190 
.102 


(6.74) 


and  the  optimal  policies  for  various  values  of  £  given  in  Table  6.2. 


Table  5.2.  Optimal  policies  for  K«20 


7  1 

.5 

.2 

.15 

.1 

.05 

.01 

.02 

.02 

.02 

.02 

.02 

.02 

.02 

.02 

CM 

O 

• 

.02 

.02 

.02 

.02 

.02 

.02 

.02 

.02 

.02 

U*(G,D) 

.114 

.129 

.132 

.135 

.138 

.140 

.135 

.139 

.140 

.140 

.141 

.141 

.165 

.140 

.133 

.126 

.118 

.110 

.116 

.167 

.178 

.191 

.200 

.200 

.185 

.192 

.193 

.193 

.192 

.191 

.200 

.200 

.180 

.158 

.122 

.108 

These  results  confirm  our  intuition  in  that  lower  values  of  K  result  in  more 
maintenance  scheduled  (higher  values  of  u(G,D)).  From  a  computational  point 
of  view,  the  advantages  of  the  two-time-scale  algorithm  over  the  standard 
method  were  significant.  For  this  example,  these  advantages  can  be  summarized 


as  follows: 


1)  Memory  requirements  -  The  ability  to  aggregate  interactions  in 
the  form  of  BT  combined  with  having  to  solve  only  subsystem 
problems  reduced  on  line  storage  requirements  by  about  a  3  to  1 
ratio. 

ii)  Decentralized  computations  -  The  computations  are  performed  in  a 
decentralized  manner  allowing  distributed  computing  when  needed, 
ill)  Ill-conditioning  -  As  e  becomes  smaller,  the  orders  of  magnitude 
difference  between  and  B(u)  causes  the  high  order  problem  to 

converge  very  slowly.  The  two-time-scale  solution  is  independent 
of  e  and  converges  rapidly. 

iv)  Computation  time  -  Due  to  both  the  reduced  dimension  of  the  sub- 
system  problems  and  the  improved  conditioning  of  the  two-time¬ 
scale  algorithm,  CPU  time  was  significantly  reduced.  For  e  =*  .2 
the  reduction  in  CPU  time  was  about  3  to  1,  for  e  »  .1  about  6  to  1 
and  for  e«  .05  about  12  to  1. 

In  conclusion,  this  simple  9th  order  example  has  clearly  demon¬ 


strated  the  usefulness  of  the  two-time-scale  algorithm.  For  more  complex 
controlled  Markov  chain  problems  [21,59,62],  the  advantages  of  this  algorithm 
will  be  of  even  greater  significance. 


CHAPTER  7 


CONCLUSIONS 

The  need  to  analyze  and  control  large  scale  systems  is  a  research 
area  that  will  always  remain  rich  in  potential.  More  demanding  performance 
leads  to  more  complex  models  necessitating  the  research  and  development  of 
improved  analysis  and  design  techniques.  This  thesis  has  helped  to  unify 
one  area  of  such  research  and  open  up  another  potentially  promising  new  area. 

In  Chapters  2  and  3  we  considered  deterministic  linear  time- 
invariant  systems  and  the  existence  of  a  "two-time-scale"  property.  It  was 
shown  that  the  reduced  order  modeling  obtained  through  singular  perturba¬ 
tions  was  directly  related  to  dominant  left  and  right  eigenspace  power 
iterations.  This  led  to  a  unified  design  methodology  for  reduced  order 
modeling  and  control  of  two-time-scale  systems  from  which  many  previous  design 
methods  were  shown  to  have  been  special  cases . 

In  the  remainder  of  this  thesis,  we  considered  stochastic  systems 
which  can  be  modeled  as  large  finite  state  Markov  processes.  The  "weak"  and 
"strong"  transition  probabilities  characteristic  of  many  Markov  chain  models 
was  interpreted  as  a  two-time-scale  property  through  singular  perturbation 
modeling.  This  led  to  the  concept  of  a  reduced  order  "aggregate"  Markov 
chain.  This  enabled  reduced  order  asymptotic  series  solutions  to  be  obtained 
for  the  steady  state  probability  distribution,  a  problem  frequently 
encountered  in  queueing  theory.  The  aggregate  model  is  then  used  to  obtain 
near  optimal  policies  for  controlled  Markov  chain  problems.  The  resulting 
optimization  algorithms  are  decentralized  in  the  sense  that  fast  subsystems 
compute  their  controls  "locally"  with  the  aggregate  coordinating  necessary 
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information  between  subsystems  on  a  slow  time  scale.  This  avoids  much  of  the 
computational  burden  associated  with  many  controlled  Markov  chain  problems. 

The  performance  functions  considered  were  the  discounted  cost  and  the  average 
cost  per  stage. 

The  potential  areas  for  further  research  and  applications  are 
numerous.  Performance  analysis  of  computer  systems  and  numerical  solutions  to 
complex  stochastic  control  problems  seem  to  be  the  most  immediate  application 
areas.  Research  areas  include  incorporating  dominant  transient  states  [25] 
into  the  two- time-scale  model  (5.1),  decentralized  control  over  a  finite  time 
horizon,  and  Markov  modeling  techniques  that  result  in  the  form  (5.1).  Many 
new  results  are  now  starting  to  appear  in  these  areas  [25,57,61,62].  The 
theoretical  richness  of  this  research  area  combined  with  numerous  potential 
application  fields  should  result  in  many  more  significant  contributions  to 
Markov  modeling  and  Markovian  decision  processes  in  the  near  future. 
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